Application: real and imagined link metadata

I recently listened to an episode of This American Life about patents. One of the people interviewed talked about how several patents are registered for each new invention. The author Steven Johnson touches on this subject as well in his book Where Good Ideas Come From: The Natural History of Innovation. Ideas need the right conditions in order to flourish. So, regardless of who gets the credit (Otlet), the hyperlink had to be invented. One thing, however, came out slightly different from the way it was envisioned.

Otlet envisioned links that carried meaning by, for example, annotating if particular documents agreed or disagreed with each other. That facility is notably lacking in the dumb logic of modern hyperlinks.

Both Bush and Otlet imagined a meaningful link: a relationship that not only connects two records but also says something about the nature of the connection. Bush addressed this as a side note, something the owner of the Memex might want to do. For Otlet it was an integral part of the Mundaneum. I would argue that although modern hyperlinks are dumb in themselves, as Alex Wright suggests in his article about Otlet, the system that monitors them is highly sophisticated.

This is what a Google search result looks like

When we click on a search result on Google, it might seem like we are following the link that’s listed in the results, but if we copied the link that we are actually clicking on, we would see something like this:

https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCkQFjAA&url=http%3A%2F%2Ftechcrunch.com%2F2012%2F05%2F19%2Fhyperlinks-are-dumb-and-bleeding-money-how-to-ensure-yours-arent%2F&ei=grt3UuLqLeO0sASqhYCABg&usg=AFQjCNE6bsDmIeigsJe4B11AM8Nf4D2oFg&bvm=bv.55819444,d.eW0

This is how Google keeps track of extra information about this particular link, specifically in relation to whoever is searching. Computers read this gibberish as follows: everything after the question mark (?) is a list of key=value pairs separated by ampersands (&). Count the ampersands, add one, and you know how many pieces of information are transmitted with each link clicked.
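As a rough illustration of that reading, the sketch below splits the Google link above into its key=value pairs with Python's standard `urllib.parse` module. The variable names are illustrative, and nothing here reflects how Google itself processes the link or what the individual keys mean.

```python
from urllib.parse import urlsplit, parse_qsl

# The link copied from the search result, split over several lines for readability.
link = (
    "https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1"
    "&ved=0CCkQFjAA"
    "&url=http%3A%2F%2Ftechcrunch.com%2F2012%2F05%2F19%2F"
    "hyperlinks-are-dumb-and-bleeding-money-how-to-ensure-yours-arent%2F"
    "&ei=grt3UuLqLeO0sASqhYCABg&usg=AFQjCNE6bsDmIeigsJe4B11AM8Nf4D2oFg"
    "&bvm=bv.55819444,d.eW0"
)

# Everything after the question mark is the query string.
query = urlsplit(link).query

# keep_blank_values=True keeps empty parameters such as "q=".
pairs = parse_qsl(query, keep_blank_values=True)

for key, value in pairs:
    print(f"{key} = {value!r}")

# Ten ampersands separate eleven key=value pairs.
print(query.count("&"), len(pairs))
```

The `url` key holds the address we thought we were clicking on, percent-encoded so that it can travel inside another URL; `parse_qsl` decodes it back into a readable address.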

An ad on Facebook

This kind of link metadata is easy to find. The link to a Facebook page from an ad on Facebook is much longer, and looks like this:

https://www.facebook.com/zipcar?ft[tn]=kC&ft[qid]=5942430062830077437&ft[mf_story_key]=6112134485815816105&ft[ei]=AQJmul_i25lBjkFNSN2Def2xXsSg8uExKzbPB_hXexxByu0rGgLyuKHocgquQAXpJpSbVVp95EkBDj7dTStaGZ8xvRAYxkLKuoo3ZyQIAINIGMLVwI62xVYdcRwIof2iEkkbhzbAfKPvG2rxB49C7V-U9AIUvj0EOSen9Q0ldbQg12iCNwCi7zEoaRGH6PFJChNBSKUjG4Xi4kTlP6zOYYcFtu82lfffRhyH2kgkj651cmDrp6KeQJElfOHxMb8ifJTo3RCnAVhyK7ON4bymIa9L4G1afcsM_8_y8xHG_9fyMldi3jKg853_5n1aBxnbRrMuOdCA1byIbhnxnyM53TjY99wnSMce61v1R1wv1x563PA2BpgVc6aCT9XnKF7eBPBcplO24asu7hE16d-vrTRaHG4YB0tcMFDhzwxWQ-An3Q-JQR4z7BmuctHxxfRxyoZNwNGSQiazP7nJTUCd_QwwqBgl-xHUa2O1qq5PYUl3nOOof3HgdnJQ-FyBjNaHLsKLfmFzNOjbEn-kcRmGkjeiA5gkrrdmOEQO5WZ8BFUk60Ffvu1pPpp6tKbCpfmoKrOANYXbpRtchIh245XTH0eqFyd4K9nUWVbfRQqhBCsM-Ycvkp9Gv4-50HFVmDTQv3WpO-5fc7ggLObncjlNS75F5tvS0bScl-Fid768_FVCRnB8hz18oh1bLUKitDLaipphkdQueQf2WpPUQ6NzwCL9F9MoJZJvcYTE3INiHRv0KfysTpqopFZD0JWJ-8_D0UGYmev3xZJgAxJ4DQvfKwmLkFEi1C-YbYBxcJO-Xs7xutvBrBOa32TkkAobFzwe9J98etW_CLgziePG2slhxSX1L1KOEYd06y6Temz3K9wYrZ8kirMiiXowg9SEKNdaeRe9Cbe3sJyVM4EmKRUyBXca94h1kqJyxx2FXtE14aVgO8nm4mLmdKHXLLRLymDFDcaK4osL_g-0NQrKA2wOrBH99IcpZStP2SDWL_x2XRfplABVatR9YD9XeTeQxdEUxMOmn1R9xoiRaVAxjJP1oo8LrXXCHWQGBJbrxdo2Exk7k3HBx-YtvwXOqIMrnApnKCmxLqJDv42Oxi1TKkiKdGDzyVFcBuzKMqbcxGYTVRWXuxwvLXNhQChdRgRKcymJTIJKWj_N-XhjMjsyoPGfqmfT5T–ss2bvbLsyQrgUu0hd85CevB7HJIe-ztij2TmzUSPeBiU9m7HJrD6ICO6Q3si2P5w7xCvtkgIhzF8-_liov5pmljBUxunaLs_HpABssi-3I-H_2H-kX8PDT4-5YBb28r5XxyoseMH50QhP69qdEyfAJ2REsVYOpzT9Bgmds98-wz0ugx8k5pH8_PoQ-Mx_AeoJxHFOYKh3Mnw9c3nTfVP6yo6znsRAyrkH4g4QisqJWTxWzNA80QC5UvXsGJpIMKoNPDQBYCTeeLTwmC4Q8Si39v1I6Xq4Cmaoz0PpCRWAlkzpgFqm-CZcKe_D9tmYnfRtOlLdpIc8qZy6N67c7-BVuS9e9V0uuuR-CfmLM9V_WXWRpjdCK-IFf7o9dfSTA8Z5q1cqwOwJQ3SowrSxFE0IsDmPIfQiUzO0HHDTDGFrMoTbPy7NeWmjobVtd6ef4qTDln19iT84cdTEDj-FX0fIwsSiwkdMUx0npZqnizTN-0yPecu3BZWRYszXyB8mI1od26aaHydy_uPNu9ntXgpTFCMmbip5MbbceBs9tzv7s9JIiwj4dEp8ZZDbvNxaCrSZ2X5poN-4re3LPemFYTN0qpNC6-JDB0-bAhbpp46wDKuRBzyne-3OlfzCe0wfFqmJ6B9tXXZrG5NkajAfIjJOlNlvSdIlkB6BJLKWkk1yGOfCpzJX8bC4yLnt_ko&ft[fbfeed_location]=1&__md__=1

In addition to all of this information, there is data about our location (derived from our IP address), operating system version, browser version, bandwidth, language preferences, and the time we spend on each page. If we have an account with a website, all of this information is tied to that account as well. Sometimes we get access to this information ourselves, whether by using Google Analytics to get statistics on our websites’ visitors or by tracking how many people clicked on links in emails we send out.

Mailchimp lets you see how many people followed links in emails you sent, as well as who and where they are.

Google Analytics reveals some of its link metadata

All of this spying is possible thanks to what Muhammad Haadi calls the era of relational databases. The way databases are designed allows for cross-references between records, representing different kinds of relationships: one to one, one to many, and many to many. Unfortunately, most people do not have the technical knowledge or desire to create and host their own databases. Instead, they use databases that were created by other people, usually private companies.
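To make the three kinds of relationships concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`visitor`, `account`, `click`, `interest`) and the sample rows are invented for illustration; they are not taken from any real company's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visitor (id INTEGER PRIMARY KEY, ip_address TEXT);

-- One to one: the UNIQUE key allows at most one account per visitor.
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    visitor_id INTEGER UNIQUE REFERENCES visitor(id),
    email TEXT
);

-- One to many: a single visitor can produce many clicks.
CREATE TABLE click (
    id INTEGER PRIMARY KEY,
    visitor_id INTEGER REFERENCES visitor(id),
    url TEXT
);

-- Many to many: a join table pairs visitors with interests.
CREATE TABLE interest (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE visitor_interest (
    visitor_id INTEGER REFERENCES visitor(id),
    interest_id INTEGER REFERENCES interest(id),
    PRIMARY KEY (visitor_id, interest_id)
);

INSERT INTO visitor VALUES (1, '203.0.113.7'), (2, '198.51.100.23');
INSERT INTO account VALUES (1, 1, 'someone@example.com');
INSERT INTO click VALUES (1, 1, 'http://example.com/a'),
                         (2, 1, 'http://example.com/b'),
                         (3, 2, 'http://example.com/a');
INSERT INTO interest VALUES (1, 'car sharing'), (2, 'barter');
INSERT INTO visitor_interest VALUES (1, 1), (1, 2), (2, 2);
""")

# Cross-reference the tables: number of clicks and interests per visitor.
rows = conn.execute("""
    SELECT v.ip_address,
           (SELECT COUNT(*) FROM click c WHERE c.visitor_id = v.id),
           (SELECT COUNT(*) FROM visitor_interest vi WHERE vi.visitor_id = v.id)
      FROM visitor v
     ORDER BY v.id
""").fetchall()

for row in rows:
    print(row)
```

The point of the sketch is only that once records reference each other by id, a single query can assemble a profile out of separately stored pieces.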

“When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard … Occasionally he inserts a comment of his own, either linking it into the main trail or joining it by a side trail to a particular item.”

Naming the link and commenting on it is quite powerful. Unlike what Bush describes, the reality of the web is that the user builds the trail, and then bots and spiders sent by companies name it in their code book. Institutions, companies, and services are making various “orders” explicit by means of modeling data and storing it on servers in a normalized form. As “users”, we usually select names that are presented to us, instead of naming things ourselves.

Sexual orientations on OkCupid

As Haadi explains, a table often models a type of data (person, building, transaction, fluffy object), and a schema specifies the attributes of that data, each attribute with its appropriate data type.

Numeric data types

Data types exist because of the way data is stored in memory. A non-negative whole number smaller than 256 (2 to the power of 8) fits in 1 byte of memory. A floating-point number requires at least 4 bytes. Computers did not have gigabytes of memory in the past, and optimizing memory usage was crucial for getting the most out of the hardware that was available (today developers are kind of careless about this, which leads to the software-hardware development race). These data types, which were developed to correspond to the physical space that data occupies in memory, are still the basis for any database design.
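The sizes mentioned above can be checked with Python's standard `struct` module, which packs values into raw bytes. This is only a small sketch: the format codes `B`, `f`, and `d` are `struct`'s own names for an unsigned 8-bit integer, a single-precision float, and a double-precision float, and a given database may store its types differently.

```python
import struct

# Standard sizes in bytes (the "<" prefix disables platform-specific padding).
byte_size = struct.calcsize("<B")    # unsigned 8-bit integer
float_size = struct.calcsize("<f")   # single-precision floating point
double_size = struct.calcsize("<d")  # double-precision floating point
print(byte_size, float_size, double_size)

# One byte has 8 bits, so it can hold 2**8 = 256 distinct values: 0 to 255.
packed = struct.pack("<B", 255)
print(len(packed))

# 256 no longer fits in a single unsigned byte.
try:
    struct.pack("<B", 256)
    fits = True
except struct.error:
    fits = False
print(fits)
```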

A MySQL table schema for pets

When it comes to entering data into the tables of an online database, we usually do it through forms. Very often, a form mirrors a database table, at least partially. For example, in the Facebook database there is probably a table for users, and in it a column for relationship status. That column points at a row in another table (remember, we are in the relational database era) that contains the possible values for relationship status. Facebook is probably also storing information about when we changed our status, from what to what, which other user it relates to (relational forever), and so on. We are left to choose one value out of a few options that were decided for us in advance. The name of each relationship status in Facebook’s relationship status table is probably of data type “varchar” with a limited character length. Because the form mirrors the database design, we can only select one of the predefined options.

Relationship status according to Facebook
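The guess about how such a form mirrors its table can be sketched as follows. This is purely hypothetical: the table names, column names, and the three status values are invented for the example and are not Facebook's actual schema. It uses SQLite through Python's `sqlite3` module, with a foreign key standing in for the "column that points at a value in another table".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
-- Lookup table: the options that were decided in advance.
CREATE TABLE relationship_status (
    id INTEGER PRIMARY KEY,
    name VARCHAR(32) NOT NULL
);
CREATE TABLE user (
    id INTEGER PRIMARY KEY,
    name VARCHAR(64) NOT NULL,
    relationship_status_id INTEGER REFERENCES relationship_status(id)
);
INSERT INTO relationship_status (id, name) VALUES
    (1, 'Single'), (2, 'In a relationship'), (3, 'Married');
INSERT INTO user (id, name, relationship_status_id) VALUES (1, 'Ada', 1);
""")

# The form's <select> mirrors the lookup table: one <option> per row.
options = conn.execute(
    "SELECT id, name FROM relationship_status ORDER BY id"
).fetchall()
select_html = "\n".join(
    ["<select name='relationship_status_id'>"]
    + [f"  <option value='{id_}'>{name}</option>" for id_, name in options]
    + ["</select>"]
)
print(select_html)

# A value outside the predefined rows is rejected by the database itself.
try:
    conn.execute("UPDATE user SET relationship_status_id = 99 WHERE id = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

In a design like this, the restriction lives in two places at once: the form offers no other option, and the database would refuse one anyway.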

There are many examples of this approach to design. The developers and designers of an online piece of software choose the type of input field according to the type of data. Textual data is entered through a text field; an integer representing a relationship in the database is entered through a select field; boolean data (true/false), such as a user preference, is represented as a check box. Some applications grant users more control by letting them create the values that can then be selected from, such as categories or tags on WordPress, but there is still a one-to-one relationship between data types and design.

Styled input field, select, and checkbox
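The one-to-one pairing between data types and form controls can be written down almost literally. The mapping and the field names below are hypothetical, a sketch of the convention described above rather than the code of any particular framework.

```python
# Hypothetical mapping from a column's data type to an HTML form control.
WIDGET_FOR_TYPE = {
    "varchar": "<input type='text' name='{name}'>",
    "foreign_key": "<select name='{name}'></select>",
    "boolean": "<input type='checkbox' name='{name}'>",
}


def render_field(name: str, data_type: str) -> str:
    """Return the form control conventionally paired with a data type."""
    return WIDGET_FOR_TYPE[data_type].format(name=name)


# An invented schema: (column name, data type).
schema = [
    ("display_name", "varchar"),
    ("relationship_status_id", "foreign_key"),
    ("receive_newsletter", "boolean"),
]

form = [render_field(name, data_type) for name, data_type in schema]
print("\n".join(form))
```

Nothing in the mapping leaves room for an answer that sits between two options; the control is fully determined by the storage type.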

I found one example where the interface for entering data is decoupled from strict data types. Ourgoods is a one on one barter network. When editing your profile, you use a widget to set where you are in a range between two things.

From Ourgoods.org edit profile page

This got me thinking about relational databases and links in a new way. Data types can be relational as well. What would it be like to create data types that go beyond merely representing memory requirements? They could be tied to design and allow for more fluid choices. I like Ourgoods’s approach, so I made a diagram that shows more examples of similar relationships and groupings.

Imagined gradient data types and the “primitive” data types that they group together
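One way such a gradient data type might look in code is sketched below. The `Gradient` class, its field names, and the two example poles are all made up for illustration; the sketch only assumes that a value is a position between two named ends, and that it can still be collapsed into the "primitive" boolean it groups together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gradient:
    """A position on a range between two named poles (hypothetical type)."""
    low_label: str
    high_label: str
    position: float  # 0.0 means fully low_label, 1.0 means fully high_label

    def __post_init__(self):
        if not 0.0 <= self.position <= 1.0:
            raise ValueError("position must be between 0.0 and 1.0")

    def as_boolean(self) -> bool:
        """Collapse the range to the true/false value it generalizes."""
        return self.position >= 0.5

    def describe(self) -> str:
        return (f"{round((1 - self.position) * 100)}% {self.low_label}, "
                f"{round(self.position * 100)}% {self.high_label}")


preference = Gradient("working alone", "collaborating", 0.7)
print(preference.describe())
print(preference.as_boolean())
```

A check box could only have stored the second printed value; the gradient keeps the first as well.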

What if content management systems that let you create your own taxonomies also let you create situations where choices don’t have to be as strict as they currently are? What if links contained information that goes even beyond Otlet’s vision, so that instead of only agreeing or disagreeing with each other, documents fall on a range between agreement and disagreement? All of the metadata that’s available for links online at the moment is quantitative. That makes sense, since it is created in the context of targeting audiences for ads and creating marketing campaigns. However, the information we put online has layers of value to us that are qualitative as well. What would it look like if we were in control of defining the relationships between things and naming them?
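As a thought experiment, a link that carries this kind of meaning could be modeled like this. Everything in the sketch is imagined: the `Link` class, its fields, the example addresses, and the choice of a scale from -1.0 (disagreement) to 1.0 (agreement) are assumptions made for illustration, not an existing standard.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """A hyperlink that says something about the connection it makes."""
    source: str
    target: str
    name: str          # chosen by the person making the link
    agreement: float   # -1.0 = disagrees, 0.0 = neutral, 1.0 = agrees
    comment: str = ""

    def __post_init__(self):
        if not -1.0 <= self.agreement <= 1.0:
            raise ValueError("agreement must be between -1.0 and 1.0")


links = [
    Link("http://example.org/my-post", "http://example.org/essay-a",
         "builds on", 0.8),
    Link("http://example.org/my-post", "http://example.org/essay-b",
         "partly disputes", -0.4,
         "agrees on the history, not the conclusion"),
]

# Order the outgoing links from strongest disagreement to strongest agreement.
ordered = sorted(links, key=lambda link: link.agreement)
for link in ordered:
    print(f"{link.agreement:+.1f}  {link.name}: {link.target}")
```

Here the name and the comment are written by whoever creates the link, which is closer to Bush's code book than to the tracking parameters shown earlier.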