Wednesday, February 19, 2020

What defines culture

Here is a model. Culture is defined by how behaviour is evaluated. In the extreme, a culture is defined by the actions that are considered most honorable, and most shameful.

What is the consequence of this? It means that the most succinct description of a culture is to name its most honorable and most shameful acts. For example, explaining hacker culture to my mom:

The most honorable thing a hacker can do is to make software that solves a problem that a lot of people (especially other hackers) have, and then give it away for free. The most shameful thing a hacker can do is to steal someone else's work and pass it off as their own. [1]

A description of science culture would be similar, replacing 'software' with 'insight'. What other cultures can we describe this way? Scouting culture:

The most honorable thing a Scout can do is to maintain a long and faithful service to the group, and to the well-being of the surrounding community and nature. The most shameful thing a Scout can do is to betray the group or society at large, or to destroy nature.

I would say that scouting as a culture is less self-centered than other hobby cultures, in the sense that the well-being of things outside the culture itself weighs so heavily on the scale. If you're reading this and objecting that "my hockey team helps the homeless in the community!", then you may be right that your group cannot be called self-centered. What I would ask as a follow-up question, however, is: "To what extent does helping the homeless raise your status within the hockey team?" It is this internal status measuring, rather than the consequences for the outside world, that I'm interested in right now.

(A common theme in all the cultures taken as examples so far is that altruistic behaviour towards the in-group is highly rewarded, and egoistic behaviour is punished. Can we think of counterexamples to this? I would say that competitive cultures, such as competitive mathematics and programming, business, and law, are counterexamples. In cultures like these, it can be honorable to trick or outwit someone in the in-group at their expense. However, this usually comes with some unspoken rules about what is considered fair play. The culture here is more iron-law: it is more shameful to let oneself be harmed than to be the one who deals out the harm. Is this the default? Actually, I doubt it. The most competitive cultures have something else in common: they have an external evaluating agent. One's 'score' is not set by one's peers, but by a central authority. The honor handed out by the central authority is not absolute but relative, and therefore a finite resource. So the cruelty of the most competitive cultures is the cruelty of zero-sum games. I'm making a prediction: even prehistoric nomadic cultures that occasionally attacked and killed each other were not as competitive as Harvard Law School. The bounded consequences of competitiveness at HLS do enable more egoistic behaviour, however.)

What can we say about online culture, on say Twitter or Reddit? Not much in general, since the same comment can either yield praise or raise a mob depending on which subreddit or Twitter cluster it is posted in. So according to this model, online culture is very fragmented, which sounds about right.

Why are honor/shame markers good definitions of culture? For one thing, they are very actionable. They tell you what to do in order to be accepted by a culture. Do the honorable things, and do not do the shameful things. It is very direct. Even if the actionability is not relevant, for example if we are investigating an ancient culture, it is still very relatable. Knowing that cattle and horse theft was considered a worse crime than manslaughter in the Old West tells us more about what life was like on the frontier than seeing a bunch of boots and revolvers, I think.

A problem with culture-as-behaviour-evaluation is that it is not easy to infer from archaeological evidence. Well, measuring culture this way is perhaps not a means, but an end, for archaeology.

Let's take a negative approach for a minute and consider: what measures are bad, or inefficient, at defining a culture?

Monday, February 17, 2020

The Value of Formalization

Suppose someone hires us to solve a problem that is currently being solved by trial and error, rules of thumb, or professional tradition. We are supposed to solve it using math, or programming, or something similar. Or, as we say between you and me: using formal methods. Here is a naïve description of where the value enters the problematic system when a formal approach is used:

First, we take a description of the problem and turn it into an equation, or whatever. Then we calculate the logical consequences of the equation and Oh My God: look at the Result! Now we know; putting this piece of information back into our process helps us save money.

Today, the "equation" part is almost always realized as software. But the thinking still applies: the value is expected to come at the end. Here is a more realistic version of what happens:

First, we take a description of the problem and turn it into an equation, but wait... this description contains a lot of holes. It's not that the computer has a hard time crunching the numbers; it's that we can't even tell it what to do in the first place. The way the problem is currently understood makes no sense, logically. Parts of the process that are thought to be clearly prescribed by rules actually require a lot of human judgement and discretion to work, and this is a large source of variation in practice. Before we can even begin to optimize this in software, we need to gain a more exact understanding of the process than anyone has had before.

In this version, the value starts coming in at the beginning, and it's a different kind of value. A better understanding of one's process is not always actionable. On the other hand, it can be actionable in very unexpected and profitable ways. Knowledge has that property. Currently, knowledge has to pass through a person's brain to have a chance at being used at its full value. So when given the task of formalization, take the chance to discover something new about a familiar process. Often, it is not optional anyway.

Thursday, February 6, 2020

What to remember

Prioritize Invariance


Only remember things that are true now, and that will always be true. An example of a statement that will not always be true:

Donald Trump is president.

An example of a statement that will always be true:

Donald Trump was elected president of the USA in 2016, and as of the 5th of February 2020, he held the position. 

So a statement that is not invariant can be made invariant by specifying a context for it. Why prioritize invariance? Simply put: so that you can trust your own memory. If you do not couple a conclusion with its assumptions in memory, you may miss when the assumptions change and your knowledge becomes outdated. If you go to the movies and see The Wolf of Wall Street and come out unhappy, something like this can easily become a cached decision:

I do not like The Wolf of Wall Street. I will not agree to rewatch it.

Consider specifying this as an observation (about yourself) instead:

When I came out of the cinema after seeing The Wolf of Wall Street, I remember feeling that the movie was too long. Also, the main character seemed unbelievable in the final 45 minutes.  

Suppose now that a friend offers to rewatch The Wolf of Wall Street with you some years later. The friend has a re-edit of the film with a shorter ending that is meant to be more true to what really happened. With the first, emotion-based memory, you would not notice that this re-edit addresses exactly the issues you had with the movie. Furthermore, remembering things this way makes your opinions more interesting to others, since they are less subjective. Remembering reasons rather than general personal impressions makes it possible for people who hear your opinion to determine whether it applies to them as well. This is an example where the consequences are of little significance, of course, but suppose the same thinking is applied to, for example, one's impression of a person, a new piece of technology, a travel destination, a political party, or a potential employer! One could easily miss things becoming desirable (or undesirable), despite having seen enough evidence to change one's opinion.

Years


When opening a book, the first thing one should do is to check the year of publication. Why? If you know the year of publication, you know the information that was available to the author when writing the book. How do you know what information was available in such and such a year? By knowing the year of publication of other books! So by remembering a linear amount of information (i.e. one integer per book), it becomes possible to consider a quadratic number of connections (i.e. between any pair of books).
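
To make the linear-versus-quadratic point concrete, here is a minimal sketch in Python (the titles and years are well-known examples of my own choosing, not books discussed here): remembering one integer per book is enough to order any pair of books in time.

    # One remembered integer per book lets us order any pair of books in time.
    from itertools import combinations

    publication_year = {
        "On the Origin of Species": 1859,
        "The Selfish Gene": 1976,
        "Godel, Escher, Bach": 1979,
    }

    # n books give n*(n-1)/2 possible pairwise comparisons "for free".
    for a, b in combinations(publication_year, 2):
        earlier, later = sorted((a, b), key=publication_year.get)
        print(f"{earlier} ({publication_year[earlier]}) was available to the author of "
              f"{later} ({publication_year[later]})")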

The same argument can be applied to people (knowing their birth and death years) and to movies (year of release). I have found it less useful to remember years for music. Perhaps it is because I do not know enough music, and so have not been able to see the effects of the quadratic number of connections. Another reason may be that, since I am not trained in music, I do not realize when there is an interesting connection between two songs. In that case, becoming more learned in music would open up a world of interesting observations and conversation topics for me. A third possibility is that the people who claim there are interesting connections between, for example, different pop songs are just kind of faking it. But even with this suspicion, it might be worth practicing a better ear for music.

Another field where I do not find it particularly useful to remember years is technology. There are some key inventions, such as the transistor in 1947, that one should know. One reason why exact years are less useful here is that important inventions were often developed over a period of time, so an approximate year or decade is fine. Big inventions also take time to reach their potential. An example is the transistor radio, released in 1954, seven years after the transistor was invented. The first neural network was demonstrated by Frank Rosenblatt in 1958, but of course neural networks were not very important until 2012. Another reason why years are not so important for inventions is that it is more informative to remember things about the invention itself that explain why it was able to replace the previous way of doing things. The problem with the 1958 neural network was that it lacked multiple layers and backpropagation. This was fixed around 1978. The next missing features on the road to greatness were receptive fields and layer-wise training, introduced in the early 1990s and in 2006 respectively. The final touch was data augmentation, together with a large increase in the total amount of data and training, which came in 2012, after which the big break finally arrived.

Location


When someone tells me about a new development, one of the first things I ask is: who is doing this? Which company or university? This is often ignored in news reports of new findings. A news segment may say "Italian researchers demonstrate such-and-such". Knowing that they are Italian is not enough: it is just as well to remember the university and/or city. Why do I insist on this?

One reason is that it is useful for networking. If you read a good article from the University of Milan, for example, and later meet someone in your field who studied in Milan, the chances are quite good that they have met the authors of the article! This helps you ask more informed questions. It also shows that you have bothered to remember something related to them. In the best case, they will want to pay you back in kind by asking about your research.

However, I think the most important reason for remembering things by location has to do with the concept of a mind palace. The idea of a mind palace is that you imagine a place with lots of different objects and textures and levels and whatnot, and you use the image of this place to connect different memories to each object. To me, using a mind palace seems bogus. It seems totally arbitrary. Better, then, to connect things to an image that is not arbitrary at all: the map of the world! The world map is something that should be remembered anyway, so it costs very little extra. It also works well at different resolutions: one knows about more things in one's home country, but one also has a better idea of its geography.

Using the historical timeline as a third dimension of this world-map mind palace also works very well. I don't see how it is possible to organize memories without having a good visualization of the historical timeline.

Skeptical and Charitable

It is important to be skeptical. Being skeptical means that you do not accept things on hearsay. You do not accept a statement that contradicts your present beliefs without a demonstration that proves it. To a skeptical person, a proof is necessary. A person who is not sufficiently skeptical will accumulate false beliefs. Carrying a lot of false beliefs leads to contradictions in beliefs. Being used to contradictions makes you less sensitive to them. This makes it harder to notice important contradictions. Not being able to notice important contradictions makes it even harder to be skeptical. So it is a vicious cycle. A bad model of the world makes updating your model of the world harder, so be careful about what you put in yours.

Being charitable means that you are willing to accept a statement that contradicts your current beliefs, if given a demonstration. To a charitable person, a proof is sufficient. A person who is not sufficiently charitable will ignore data that contradicts their beliefs, despite there being good reason to accept the data. Such a person can miss good opportunities. So being charitable is also important.

Notice the two logical statements above:

  • If skeptical, then proof is necessary.
  • If charitable, then proof is sufficient.

So to a person who is both skeptical and charitable, beliefs are based on the proofs that they have seen.
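
As a small aside, the two statements can be written out as a toy model (the function and its default behaviour are my own illustrative choices, not anything from the text above): a skeptical agent treats proof as necessary, a charitable agent treats it as sufficient, and an agent that is both believes exactly those statements for which it has seen a proof.

    # Toy model: belief as a function of having seen a proof.
    def believes(has_proof: bool, skeptical: bool, charitable: bool) -> bool:
        if skeptical and not has_proof:
            return False  # skeptical: proof is necessary, so no proof means no belief
        if charitable and has_proof:
            return True   # charitable: proof is sufficient, so a proof means belief
        return False      # otherwise the data is ignored, which amounts to disbelief

    # For an agent that is both skeptical and charitable: belief <=> proof seen.
    for has_proof in (False, True):
        assert believes(has_proof, skeptical=True, charitable=True) == has_proof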

What if every time we were faced with an opportunity to be skeptical, a red sign would flash saying "maybe you shouldn't believe this"? And what if every time we were faced with an opportunity to be charitable, a green sign would flash saying "maybe you should believe this"? In practice, one of the signs may fail to light up. If both signs fail to light up and we just ignore the data, what happens then? What is the default? Skeptical. Skeptical is the default. A person who ignores all data is trivially a skeptic.

If being skeptical is the default, does that mean we should make an effort to be more charitable, even if that means accepting some false beliefs? I think not. Having bad beliefs is worse than having no beliefs, because of the vicious-cycle effect. Imagine being in a plane that is about to land at Stockholm Arlanda. The captain announces that they have lost their map of Arlanda, so instead they will use a map they have of Copenhagen Kastrup. You would rather they just looked out the window!

So much for the tradeoff between skepticism and charitability. Let's focus on increasing the sum of the two! How can we do this with a limited budget of mental energy? An obvious thing that people don't do enough is to "just google it", especially for things that have been believed for a long time and were picked up on hearsay. Such cached beliefs can make one look very stupid, since it is so easy to look them up these days. OK, suppose we are on Google. Which source do we believe? Answer: the most upstream one. In matters of science, always go to the original article. In matters of politics and such, whom do we trust about who-said-what? Answer: don't bother with who-said-what.

Is the original article the most upstream source in science? No, the most upstream source is nature herself. Nature is the final arbiter, and I don't mean the journal. Asking nature is quite expensive in general. In programming and mathematics, we are luckier: we can go through a proof step by step ourselves and check that everything is right. In programming, one can implement the algorithm and check that the output is correct. Just checking the output on examples is worth less than a full proof. However, with a paper proof there is the problem of self-doubt: did I miss something? Currently, I have higher confidence in the computer carrying out logical operations than in myself.
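
As a tiny illustration of checking by implementation (the identity below is my own example, not one from the text): verifying that the sum of the first n odd numbers equals n squared for a thousand values of n is persuasive evidence, but still weaker than the short induction proof.

    # Check a claim by running the implementation on many examples.
    def sum_of_first_n_odd(n: int) -> int:
        return sum(2 * k + 1 for k in range(n))

    for n in range(1000):
        # Passing for n = 0..999 is evidence from examples, not a proof.
        assert sum_of_first_n_odd(n) == n ** 2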