Facts and Hype

Some years ago, “Big Data” was the magic word in ICT. Little by little, it has had to tolerate “Artificial Intelligence” as a contender alongside it. The intrusion of English terms like these into the Dutch language seems unstoppable, but that is beside the point. More important is what the two have in common: they are missing something. And that becomes painfully clear, time and again, when I have to struggle through the huge number of emails sent to me.

Why does reading and processing that pile of mail feel like a struggle? Because every email program I have used so far lacks the most essential function. I can sort my mail in many ways, except the one way I would really like to: by relevance. In Outlook I can sort by “importance”, but that is not what I am after. In the first place, importance is marked by the sender and not by me as the recipient, and what a sender considers important is definitely not always important to me. In fact, marking a mail to me as highly important has the opposite effect: such mails tend to be processed at the very end of my struggle.

There is another odd thing about the “importance” marker: it doesn't change over time. The other day I received a mail about a meeting. This mail did indeed require immediate action, which would have given it high priority for me. Unfortunately, I only saw it after the meeting, and by then its relevance had vanished completely. If my email program had really been helpful, this mail would not have been shown to me at all. After all, its relevance to me had dropped from top priority to none whatsoever. That alone makes a static marker pointless.
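
To make the contrast concrete, here is a minimal sketch of a relevance that changes over time, set against the sender's static flag. It is only an illustration under stated assumptions: the `Mail` class, its `relevant_until` field and the scores 1.0, 0.5 and 0.0 are all hypothetical, and no real email program is claimed to work this way.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Mail:
    subject: str
    sender_importance: str                     # static flag chosen by the sender
    relevant_until: Optional[datetime] = None  # hypothetical: e.g. start of the meeting


def relevance(mail: Mail, now: datetime) -> float:
    """Score from the recipient's point of view; the values are arbitrary assumptions."""
    if mail.relevant_until is None:
        return 0.5      # nothing known about timing: neutral
    if now >= mail.relevant_until:
        return 0.0      # the moment has passed: no longer relevant
    return 1.0          # action is still possible: top priority


def visible_inbox(mails: List[Mail], now: datetime) -> List[Mail]:
    """Hide mails whose relevance has dropped to zero, most relevant first."""
    scored = [(relevance(m, now), m) for m in mails]
    kept = [pair for pair in scored if pair[0] > 0.0]
    return [m for _, m in sorted(kept, key=lambda pair: -pair[0])]
```

The sender's flag never changes, but the outcome of `relevance` depends on when it is asked: the same mail is at the top of the list before the meeting and absent from it afterwards.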

An interesting question now is: why can't my email program sort my mail by relevance for me? After all, we do have “Artificial Intelligence”, don't we? If I, a human being, can classify by priority effortlessly, then surely that should be (to steer clear of the all but inevitable 'piece of cake') a five-finger exercise for a computer as well? If a computer can beat the world champion at Go, then surely something like sorting by relevance should be relatively easy?

Unfortunately, the last two questions still have to be answered in the negative. A computer is not capable of such things. Not yet, anyway. And that has everything to do with the "something" that is missing, and that something is called “semantics”. Simply put: what my mail means is not yet clear to the computer. Thanks to Big Data we are flooded with lakes full of stored data, but its meaning, the semantics, still hangs in the air, intangible. To sort by relevance, the computer must also be capable of deciding what is important to me. That requires interpretation, and computers are still very bad at interpretation. This is linked, among other things, to how ICT professionals approach their field: great attention is devoted to the ‘T’ in ICT, the technology, but far less to the ‘I’: the information.

What makes something information? Let's ask the largest source of information we have available. The World Wide Web brought us Wikipedia, and there we can read: "Information (from the verb informare: to inform in the sense of "to give form to the mind", "to discipline", "instruct", "teach") is related to notions of communication, perception, education, and knowledge, and can be thought of as the resolution of uncertainty." Wikipedia continues with “Strictly speaking information is information if and only if it is interpretable. Interpreting and integrating information will provide knowledge”; and that is precisely what I called semantics.

Besides information we also have the concept of data. The difference between data and information is precisely the semantics. Data becomes information through its coherence in a certain context and representation, which can be expressed in concretely formulated facts; and that is the semantics.

It is evident that semantics will be hard to automate. But a solution comes into view if, in IT, we focus on the ‘C’ that was dropped from ICT: communication.

By capturing communication and interpreting it, we can make a start on capturing the semantics digitally. The trick is that we must convert communication into facts. Once we have taken that step, an ideal picture starts to emerge: ‘Big Semantics’.

In this ideal picture, both the information and my interpretation of it will be known to the computer. With that, the computer will understand - in the form of an algorithm - the relevance of the mails sent to me, and can help me determine their priority. If the algorithm cannot interpret something, because it cannot distil recognizable facts from the context of a given communication, it should take the matter to the only person capable of settling it definitively: me personally. By asking the right questions, the algorithm allows me to designate the relevant facts in the given communication. So what it boils down to is this: the interpretation of the relevant facts from the communication should be facilitated by means of … communication. In fact, it is all about composing my Weltanschauung through my interpretation of the facts in the given communication.
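
The loop described above, interpret where possible and otherwise ask the owner, can be sketched in a few lines. This is a deliberately naive illustration, not a proposal for how Big Semantics would really be built: "facts" are reduced to phrases with a score that I assigned, recognition is plain substring matching, and the names `interpret`, `relevance_with_feedback` and `ask_user` are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical fact store: a phrase I have designated, mapped to its relevance for me.
Facts = Dict[str, float]


def interpret(text: str, facts: Facts) -> Optional[float]:
    """Return the highest relevance among recognized facts, or None if none is found."""
    lowered = text.lower()
    scores = [score for phrase, score in facts.items() if phrase in lowered]
    return max(scores) if scores else None


def relevance_with_feedback(
    text: str,
    facts: Facts,
    ask_user: Callable[[str], Tuple[str, float]],
) -> float:
    """Interpret the text; if no fact is recognized, ask the owner and remember the answer."""
    score = interpret(text, facts)
    if score is not None:
        return score
    # The algorithm cannot interpret the message, so it takes the question to me.
    phrase, score = ask_user(text)
    facts[phrase.lower()] = score   # my answer becomes a fact for next time
    return score
```

The first message about an unknown topic triggers a question; once I have designated the relevant phrase, later messages containing it are scored without bothering me again. The fact store stays with its owner, which is the point of the paragraphs that follow.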

Of course, this insight is not limited to mail; it applies to the whole abundance of information that reaches us by any other means. Email served here only as a simple example. Perhaps the continuous interpretation of all communication evokes “Big Brother” associations. But I, and I alone, am the true owner of my fact-based interpretation. This interpretation should therefore be guarded carefully and made available only to me personally, so that I am not left to drown helplessly in the 'tidal wave' of communication. And that goes for every single end user facing any abundance of information.

If we continue on this path, and manage to shift the focus in IT from technology to information through communication, then we are on the verge of a new era, in which we will truly be supported by computers, simply because they understand us. Just to put it in plain English: ‘The Era of Big Semantics’.




