The World Economic Forum estimates that 463 exabytes (1 exabyte = 10^18 bytes) of data will be generated per day by 2025. For comparison, that's roughly 185 gigabytes for every milliliter of water in an Olympic swimming pool (about 2.5 million liters).
We like to think of an increase in information as a good thing. After all, a sizable amount of the work that happens in the modern information economy is moving data around to the right places. It certainly seems like more data would help us get more of this work done.
Unfortunately, history tells us a different story. Increases in the availability of information often come with negative unforeseen consequences.
In the 1970s and 80s, the rise of the computer age disappointed many by coinciding with a slowdown in productivity growth rather than the increase that we hoped for, a puzzle economists call the productivity paradox.
Indeed, even in the field of artificial intelligence, increasing the amount of training data to learn from can, in some regimes, cause neural networks to perform worse.
The data that we're generating is less like a gentle drizzle and more like a fire hose that's being opened further every day. Those of us who wish to drink from it are left wondering how to keep from drowning.
The Prophet and The Machine
In July 1945, just one month before the dropping of the atomic bomb, Vannevar Bush published an article in The Atlantic called "As We May Think". Bush recognizes the same problem that we do, namely that the amount of data that humanity has accumulated is beginning to outstrip our capacity to consume it.
The solution he describes is a hypothetical machine called the Memex, a large electromechanical device about the size of a desk.
The Memex has a large reel of microfilm that documents can be photographed onto and read back from. Each document is given a tag, and documents can contain tags for other documents, allowing the user to quickly jump back and forth between linked documents and to create trails of documents for revisiting later.
Bush writes that this device would serve as a different kind of library, one that indexes by association rather than alphabetically or by topic.
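To make the idea of indexing by association concrete, here is a minimal sketch of tagged documents, two-way links, and named trails. It is only an illustration under our own assumptions: the names Document, Memex, link, blaze_trail, and follow are hypothetical, the store is a plain in-memory dictionary rather than microfilm, and the sample documents borrow the bow-and-arrow example from Bush's essay.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Document:
    tag: str    # identifier assigned when the document is stored
    text: str   # stand-in for the photographed page
    links: set[str] = field(default_factory=set)  # tags of associated documents


class Memex:
    """Toy in-memory stand-in for the Memex store (hypothetical design)."""

    def __init__(self) -> None:
        self.documents: dict[str, Document] = {}
        self.trails: dict[str, list[str]] = {}

    def add(self, tag: str, text: str) -> None:
        self.documents[tag] = Document(tag, text)

    def link(self, a: str, b: str) -> None:
        # Associations run both ways so a reader can jump back and forth.
        self.documents[a].links.add(b)
        self.documents[b].links.add(a)

    def blaze_trail(self, name: str, tags: list[str]) -> None:
        # Link each document to the next one and remember the order.
        for a, b in zip(tags, tags[1:]):
            self.link(a, b)
        self.trails[name] = list(tags)

    def follow(self, name: str) -> list[str]:
        # Replay a saved trail as the sequence of document texts.
        return [self.documents[tag].text for tag in self.trails[name]]


memex = Memex()
memex.add("bow-turkish", "Notes on the Turkish short bow")
memex.add("elasticity", "Textbook page on elasticity")
memex.add("bow-english", "Notes on the English long bow")
memex.blaze_trail("bows", ["bow-turkish", "elasticity", "bow-english"])

for page in memex.follow("bows"):
    print(page)
```

Nothing here is sorted alphabetically or filed under a topic. A document is reached through the documents it is linked to, and a trail is simply a saved path through those links.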
You may notice that this device sounds astoundingly similar to the modern World Wide Web. Bush does not stop there. The predictions that he goes on to make about the device are nearly prophetic in their accuracy.
"Wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails running through them, ready to be dropped into the memex and there amplified. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. The patent attorney has on call the millions of issued patents, with familiar trails to every point of his client's interest. The physician, puzzled by a patient's reactions, strikes the trail established in studying an earlier similar case, and runs rapidly through analogous case histories, with side references to the classics for the pertinent anatomy and histology. ...
The historian, with a vast chronological account of a people, parallels it with a skip trail which stops only on the salient items, and can follow at any time contemporary trails which lead him all over civilization at a particular epoch. There is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record. The inheritance from the master becomes, not only his additions to the world's record, but for his disciples the entire scaffolding by which they were erected."
Wikipedia, WebMD, Ancestry.com, Dropbox, and Reddit all have their conceptual origin in Bush's predictions. Indeed, the entire web owes a debt to Vannevar Bush, with Tim Berners-Lee crediting Bush with the idea of the hyperlink and Douglas Engelbart citing the Memex as an inspiration for his work on personal computing and the mouse.
Fulfilling Vannevar's Vision
The Memex allows us to handle the information fire hose by providing structure to our data. The associative indexes that it forms make searching for information drastically more efficient.
However, there is a key aspect of the Memex that remains unfulfilled by modern technology. Bush envisioned his device as a personal system, hoping that we would take control of our own data. The modern web is optimized mostly for consuming other people's content, much of it useless.
Fulfilling this last step is the driving purpose behind Stax.ai. We aim to help people take control of their massive quantities of data. The Memex's linked microfilm documents are replaced by scanned images that are OCR'd and indexed by our database. The trails of connections are replaced by categories of documents created by you and learned by our AI. So let us help you get a handle on your data and use the fire hose to your advantage.