Premises for a better search engine

For more than a decade now, Google has been the dominant search engine, consistently used for more than 85% of global Internet searches according to NetMarketShare. During this time there have not been any serious attempts to challenge the market dominance of the company that claims to be on a mission to organize the world’s information and make it universally accessible and useful. As outlined before (see this article), it is highly unlikely that Google will succeed in doing so, and it is indeed most likely to blame for the majority of ad copy disguised as content flooding the world wide web. The fact that the most visible part of the Internet is influenced in this way by a for-profit entity should worry every thinking person.

Google is now the second largest company in the US in terms of revenue, and more than 90% of the company’s revenue (expected to exceed $106 billion in 2017) is paid by advertisers.

The following is an attempt to describe the fundamental premises needed for a sustainable information management strategy (“Search”) that could potentially save the world wide web from the tragedy of the commons or, flippantly put, a concept for ‘building a better search engine than Google’.

Premise One: Information is a natural resource and needs to be treated as such (see natural resource management). As with all natural resources, no single person, entity, group or culture can claim exclusive rights to information. Just as physical access to water needs to be available to every human being, the information on how to reach that resource is inseparably attached to it.

Premise Two: Access to information is a human right. To protect and promote essential human interests, especially the unique human capacity for freedom (see Andrew Fagan), access to information has to be free. Censorship, as well as monopolized information organization (as de facto practiced by Google), is hence a human rights violation. ‘Right’ being synonymous with ‘legal’ and antonymous to both ‘wrong’ and ‘illegal’, every ‘right’ of any human person is ipso facto a ‘legal right’ which deserves protection of law and legal remedy, irrespective of having been written into the law, constitution or otherwise in any country (Ipso Facto Legal Rights Theory).

Premise Three: Knowledge and access to information are the natural enemies of belief (paraphrasing Plato). Belief is the enemy of progress. Or in my own words: belief is simply the absence of knowledge. An effective information management system will be able to identify and discard information that violates basic principles of objectifiable reality or otherwise rests on arguments that cannot be verified or falsified. An unscientific theory is not intrinsically false or inappropriate, however, as metaphysical theories might be true or contain truth, and are required to help inform science or structure scientific theories. Put simply, to be scientific, a theory must make at least some prediction that is potentially refutable by observation.

Premise Four: Evolutionary organization of information cannot be democratic and must follow logic (i.e. peer review), not popularity. We are all “standing on the shoulders of giants” (Newton). No progress can be made without understanding the research and works created by notable thinkers of the past. Social proof is anything but. Google’s philosophy, which assumes that democracy on the web works, is demonstrably false (read more here). A functional information management system will employ Hebbian theory. Just as biological neuroscience explains the adaptation of neurons in the brain during the learning process, the same model can be used to describe a basic mechanism for “synaptic” plasticity in connected systems, wherein an increase in synaptic efficacy arises from the repeated and persistent stimulation of the postsynaptic unit by the presynaptic cell or node (the connected ‘brains’).
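To make the Hebbian idea concrete, here is a minimal sketch of the textbook update rule, in which the link between two nodes is strengthened in proportion to how strongly both are active at the same time. The function name, the learning rate and the two-node example are assumptions chosen for illustration, not part of any existing system.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """One Hebbian step: weights[i][j] grows by rate * pre[i] * post[j].

    weights -- matrix of link strengths from "presynaptic" node i
               to "postsynaptic" node j (hypothetical layout)
    pre     -- activity of the presynaptic nodes
    post    -- activity of the postsynaptic nodes
    """
    return [
        [w + rate * x * y for w, y in zip(row, post)]
        for row, x in zip(weights, pre)
    ]


# Node 0 is repeatedly active together with both receiving nodes;
# node 1 stays silent, so its links are never reinforced.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]
post = [1.0, 1.0]
for _ in range(5):
    weights = hebbian_update(weights, pre, post)
```

Only the links leaving the repeatedly active node gain strength, which is the "repeated and persistent stimulation" the premise refers to. A practical system would also need some form of decay or normalization, since the plain rule lets weights grow without bound.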

Premise Five: Commercial interests corrupt and sway development. Consequently, the potential of connected systems and connected knowledge has been underutilized, and (altruistic) progress has de facto halted, as the majority of Internet users have accepted a marketing-driven presentation layer, essentially censorship, as the ‘status quo’.

Premise Six: DNA before intent and projection. What is needed is an objectified classification of the human element (which I label as “DNA”) within the network. Intent (i.e. Google (“search”)) and projection (i.e. Facebook) are non-directional approaches. A directional approach requires locating the user not only geographically but also in terms of education, knowledge, etc.

Premise Seven: Capturing the cognitive surplus. Cognitive surplus as used here extends beyond crowd-sourcing: it draws on any type of engagement with any type of medium that can be measured in context, which gives that engagement a qualitative element. What is needed is the utilization of the latent potential inherent in the use of information itself. For example, access to specific information by a specific individual contains a qualitative measure more relevant than any hyperlink: a research scientist spending time on a website containing information relevant to his or her field of expertise, together with his or her engagement with other (digital) information that is related contextually as well as chronologically.
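One way to picture such a qualitative measure is to weight the time a reader spends on a document by that reader's expertise in the document's topic. The sketch below is purely illustrative: the `Visit` record, the 0-to-1 expertise value and the simple product used as a score are all assumptions, and a real system would need a far richer notion of context.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    document: str     # identifier of the document that was read
    seconds: float    # time the reader spent on it
    expertise: float  # hypothetical 0..1 rating of the reader in this topic


def engagement_scores(visits):
    """Sum expertise-weighted reading time per document."""
    scores = {}
    for visit in visits:
        weighted = visit.seconds * visit.expertise
        scores[visit.document] = scores.get(visit.document, 0.0) + weighted
    return scores


visits = [
    Visit("paper-a", seconds=600, expertise=0.9),  # specialist reads at length
    Visit("paper-a", seconds=30, expertise=0.1),   # layperson skims
    Visit("blog-b", seconds=300, expertise=0.2),
]
scores = engagement_scores(visits)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here the specialist's ten minutes count for much more than a casual visit, so "paper-a" ranks ahead of "blog-b" without any hyperlink being involved.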

Premise Eight: Discarded information carries value. There is a strong tendency among researchers, editors, and pharmaceutical companies to report or publish experimental results that are positive (i.e. showing a significant finding) but very few results that are negative (i.e. supporting the null hypothesis) or inconclusive (publication bias). Effective information management will have to include negative results.

Premise Nine: Promote viral distribution of successful concepts while building ‘herd immunity’ against the adoption of destructive or dysfunctional paradigms. Herd immunity describes a form of immunity that occurs when the vaccination of a significant portion of a population (or herd) provides a measure of protection for individuals who have not developed immunity. Herd immunity theory proposes that, in contagious diseases that are transmitted from individual to individual, chains of infection are likely to be disrupted when large numbers of a population are immune or less susceptible to the disease. The greater the proportion of individuals who are resistant, the smaller the probability that a susceptible individual will come into contact with an infectious individual. The concept carries over to information and its consumption by individuals.
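The relationship between the resistant share of a population and the breaking of transmission chains can be sketched with the simplest epidemic model, in which each carrier would pass the "infection" on to r0 others if nobody were resistant. Applying this to information is an analogy only: reading r0 as "how many people a bad idea is passed on to" and the immune fraction as "readers able to recognize it" is an assumption made here for illustration.

```python
def effective_reproduction(r0, immune_fraction):
    """Average onward transmissions per carrier when a share is resistant."""
    return r0 * (1.0 - immune_fraction)


def herd_immunity_threshold(r0):
    """Resistant share above which chains of transmission die out.

    Derived from r0 * (1 - p) < 1, i.e. p > 1 - 1 / r0.
    """
    if r0 <= 1.0:
        return 0.0  # the idea already fails to sustain itself
    return 1.0 - 1.0 / r0


threshold = herd_immunity_threshold(4.0)         # each carrier reaches 4 others
spread = effective_reproduction(4.0, 0.8)        # 80% of readers are resistant
```

Once the resistant share exceeds the threshold, each carrier reaches fewer than one new susceptible person on average, so the chain shrinks instead of growing.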

Premise Ten: Subjugate linguistic barriers. Humans, like other primates, are noted for their social qualities. But beyond any other creature, humans are adept at utilizing systems of communication for self-expression, the exchange of ideas, and organization, and as such have created complex social structures composed of many cooperating and competing groups, from families to nations. Social interactions between humans have established an extremely wide variety of values, social norms, and rituals, which together form the basis of human society; at the same time, this diversity leads to misunderstanding and fear (of the unknown). An effective information management system will have to first overcome linguistic barriers before moving on to the transfer of knowledge. Hence any approach starting at the semantic level will fall short of this goal.

Premise Eleven: Create an effective marketplace for information exchange. Information is the ultimate ‘derivative’ of any asset. However, only a small fraction of information is available through organized marketplaces, most of which shift the compensation to the aggregation and distribution of the asset. An effective marketplace for information exchange will focus on compensating the creation and curation of information, hence putting the focus on the quality of information rather than its “liquidity” (accessibility).

Premise Twelve: Create an energy-optimized information system that does not require new infrastructure investments. Each connected system must not only capture and disseminate its own data, but also serve as a relay for other systems (or: nodes); that is, it must collaborate to propagate the data in the network (definition of a mesh network). Current ‘search engines’ are highly inefficient and add to the pollution of our environment. Performing two Google searches from a desktop computer can generate about the same amount of carbon dioxide as boiling a kettle for a cup of tea, according to new research. Though Google says it is at the forefront of green computing, its search engine generates high levels of CO2 because of the way it operates. When you type in a Google search for, say, “energy saving tips”, your request doesn’t go to just one server. It goes to several competing against each other. And it may even be sent to servers thousands of miles apart.
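The relay behaviour in the mesh network definition at the start of this premise can be sketched as a simple flood: every node that receives a piece of data passes it on to its direct neighbours, so the data reaches nodes that have no direct link to the source. The node names and the `links` table below are hypothetical, and real mesh protocols add routing, deduplication and failure handling that this sketch leaves out.

```python
from collections import deque


def propagate(links, source):
    """Return the number of relay hops at which each node first gets the data.

    links  -- dict mapping a node to the nodes it is directly connected to
    source -- the node that originally captured the data
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in links.get(node, ()):
            if neighbour not in hops:  # relay only to nodes not yet reached
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


# A chain of four nodes: "d" is reachable from "a" only via relays "b" and "c".
links = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
hops = propagate(links, "a")
```

No node needs a connection to a central server; each one only talks to its neighbours, which is what lets such a network reuse existing machines instead of new infrastructure.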

Premise Thirteen: Create a qualified Smart Mob collaboration tool (within the peer-to-peer layer) for impromptu response to crisis situations and to actively drive topic progress. A smart mob is a group that, contrary to the usual connotations of a mob, behaves intelligently or efficiently because of its exponentially increasing network links. This network enables people to connect to information and to others, allowing a form of social coordination (the concept was introduced by Howard Rheingold in his book Smart Mobs: The Next Social Revolution).

(Preliminary) Conclusion. What is needed is a search engine in the form of an open source, independent, distributed search network and storage system (“Wiki”) designed to utilize the resources of all machines and all humans, including their relationship to the document (owner, user, contributor etc.) as well as their profile and expertise, fostering logic-driven (“evolution-like”) progress through compensation of contribution, while overcoming artificial barriers such as culture and language in a mesh-networked structure.

Prototype. We are planning on releasing a prototype before the end of the year which will initially combine the following elements:

  • browser/digital document viewer (based on an open source SDK, likely Chrome);
  • file sharing, based on group settings and/or user classification/identification through artificial intelligence (such as the one provided by ai-one).

If you feel like contributing to the effort, contact me!