Processing Posts – Data Archive Infrastructure Fall 2018

Processing post

Processing Posts

Laura Pannekoek | December 4, 2018 | One Comment

To understand global patterns of climate change across deep time, Shannon shows, researchers must study samples from ice cores, sediments, tree rings, etc. She shows how ice, rocks, sediments, and soils become archival documents in themselves.

But she argues that, for example, the preservation of ice cores as documents has a deeply paradoxical nature. It is concerned with tracking and monitoring patterns of environmental change, yet its intensely complex processes of collection, transportation, and storage require an enormous amount of energy to ensure refrigeration during all these stages. “Freezing ice cores to study climate change is a practice saturated with ironies,” she quotes Joanna Radin and Emma Kowal. In a similar vein, data storage centers now make up a large chunk of global energy use.

Data and archiving (and perhaps infrastructure?) are central to the interface between environmental processes and policy, consciousness, and imagination. How do we negotiate the energy-intensive means of tracking environmental meltdown that is itself caused by the dependency on energy? Indeed, a cosmic irony!

dubious durability

Processing Posts

Margaret Tierno | December 3, 2018 | One Comment

It is mind blowing to consider the lengths we have to go to in order to store even a relatively minuscule chunk of the information continuously being produced at astonishing rates. Not to mention how those rates are expected to rise as time goes on (“Five years ago humans had produced 4.4 zettabytes of data; that’s set to explode to 160 zettabytes (each year!) by 2025. Current infrastructure can handle only a fraction of the coming data deluge, which is expected to consume all the world’s microchip-grade silicon by 2040.” From The Rise of DNA Data Storage). Even more mind blowing to me is the development of artificial DNA technology as a denser and more durable form of data storage. “Archiving a Website for Ten Thousand Years” also mentioned DNA data storage and linked to this article: https://www.sciencedaily.com/releases/2016/04/160407121455.htm. With this picture:

Captioned:

“All the movies, images, emails and other digital data from more than 600 basic smartphones (10,000 gigabytes) can be stored in the faint pink smear of DNA at the end of this test tube.”

Another image from the sciencedaily article states there will be enough digital data by 2020 to fill six stacks of computer tablets reaching to the moon. I don’t even know what to say about that. Still, I wonder how durable these new technologies actually are. As the “Archiving a Website for Ten Thousand Years” article discussed, many of the time capsules people thought would remain stable for centuries have already been lost to renovations and decay.

Processing post

Processing Posts

Laura Pannekoek | November 27, 2018 | One Comment

In response to Christine Mitchell’s question on the balance between archiving and media archaeology, I see Shannon’s point that the historical conditions of playback sometimes matter less than at other times. Yet Shannon’s other point about the meta-dimension of sound recording for historical subjects, that is, that the sonic archival document is at once a recording of a historical event and a record of its own recording practice, becomes important when you’re talking about, e.g., race, sound, and the archive. (Like they’re doing at my school next month: http://aihr.uva.nl/content/events/events/2018/12/entanglements-of-race.html)

For example, sound registers a bodily reality in addition to semantic meanings. The idea of the grain of the voice then gains a very direct political dimension. Also, the recording equipment shapes how the sound is captured in the first place (e.g., a condenser microphone may record all sounds in the vicinity or just a single voice, or it may drown out high or low tones). A study of colonial sound archives shows that racialized listening practices span the recording, distribution, and archiving of the document. Maybe in studying sound and the archive in particular, media archaeology and archiving cannot be separate practices?

Processing Posts

Iltimas Doha | November 27, 2018 | One Comment

I’ve been thinking about a theme that we’ve touched on a couple of times: what is and isn’t archivable. I wonder what is the un-meta-able? And where metadata is partially a function of query and index, can we have non-text metadata? I imagine a scenario in which a performance is the subject. It happens on a certain date, at a certain place, at a certain time. How does one capture the metadata of the mood of an audience? Potentially, I could have a field recording of an audience conversing moments before a performance as metadata. For me, it also comes back to this question of recordings being a capture of a performance or an event, and whether these are even separable. Are the noises, the coughs, the hums, even the silences a part of the performance? Who gets to decide this?

And even if that much is figured out, which parts of an event are to be recorded faithfully, how do we faithfully preserve audio? In the readings, we see degradation of the material of the archival medium, and thus degradation on transfer, moving from one material to another.

Makes me think of a piece by a classmate of mine, which was an iteration of the I Am Sitting in a Room piece. He recorded his voice, played it back, and recorded the playback almost a dozen times over a variety of mediums (including recording the playback off of a sound system in a cafe!). What we get in return are the undulations of not only physical space, but also the digital de/encoding process.

Digital Vocal Saturation //2018 by Parsons Senior Thesis

Original Machines + Contexts

Processing Posts

Layne Whitted | November 26, 2018 | One Comment

Taking up Christine Mitchell’s question, “How important is it to access sound through ‘original’ machines and contexts, whether technological, architectural or other?” and Shannon’s point that “while we can emulate certain aspects of an original listening experience, it might be impossible to recreate its climatic conditions, social context, and ‘appropriate modes of listening,’” I wonder what counts as ‘original.’ While, as Shannon notes, this question might not be relevant if differences in recording and playback don’t actually matter for the researcher or listener, I find it interesting that what constitutes an original experience for a given listener might have nothing to do with what constitutes an original experience as understood by the media historian or archivist.

What contributes to someone’s perception of a ‘sonic archival document’ as original? Even if we can’t recreate the social context, could the use of an original machine be sufficient? Is it ever valid to say that there is only one possible original experience for a given recording? Considering that there might be multiple speakers/performers, recorders, and listeners, whose experience is the ‘original’ one we’re trying to emulate?

Data recycling

Processing Posts

Liliana Farber | November 25, 2018 | One Comment

I found it fascinating to read about the history of machine voice recognition: a development that, in the search for an empirical standard of the American English language, eliminates the richness of its speakers’ origins and, subsequently, the history of the land in which the language is rooted. It places white males as the language’s standard speakers, while speakers from different backgrounds are unaccounted for, as they are understood as “unrealistic” or “abnormal”.

As this standardization of speech is the basis of current voice recognition and analysis technologies, it was disturbing to read the trial examples and the legal implications the “standard” language has when used to judge “non-standard” citizens. People with different backgrounds are not recognized as individuals, according to this methodology, but as interchangeable members of a community.

In the race to develop exciting new products, companies pay more attention to technologies than to the data they use, often employing existing, biased data without much regard for how it was collected, categorized, and scrubbed. We should stop recycling data.

Photo Collections

Processing Posts

Iltimas Doha | November 20, 2018 | One Comment

I’m really drawn to the text that traces the two lines of thought in photographic archiving strategies. Firstly, I had only recently become aware of MCAD when discovering my program director is an alum, so it’s interesting to see it appear again. It also makes me think about how much Midwest design and art (and archiving!) is underrepresented compared with the larger coastal institutions, whose money is old and vast.

I’m also drawn to the idea of archiving images by their visual semantics. As a technologist, I always think about how machines “see” and process information: They are highly semantic! At their core, computers see one small block of color in a long single line of colors. With some math, they see edges, they see gradients:
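As a rough illustration of the “long single line of colors” and the “some math” mentioned above, here is a minimal Python sketch. The tiny image, the function names, and the threshold of 100 are all made up for the example; real computer vision libraries use more elaborate filters than this simple neighbour difference.

```python
# A tiny 6x4 grayscale "image" stored as one long single line of values
# (0 = black, 255 = white). The left half is dark, the right half bright.
WIDTH, HEIGHT = 6, 4
pixels = [0, 0, 0, 255, 255, 255] * HEIGHT


def pixel(x, y):
    """Look up a pixel in the flat list by its column x and row y."""
    return pixels[y * WIDTH + x]


def horizontal_gradient(x, y):
    """Half the difference between the right and left neighbours.

    At the image border the pixel itself stands in for the missing neighbour.
    """
    left = pixel(max(x - 1, 0), y)
    right = pixel(min(x + 1, WIDTH - 1), y)
    return (right - left) / 2


def edge_columns(threshold=100):
    """Columns in the first row where brightness changes sharply (hypothetical threshold)."""
    return [x for x in range(WIDTH) if abs(horizontal_gradient(x, 0)) > threshold]


print(edge_columns())
```

The only columns reported are the two on either side of the dark-to-bright boundary: the machine has no notion of what is pictured, only of where the numbers change quickly.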

The goal of computer vision, CV, is to get context. The largest competition/effort in CV is COCO, or Common Objects in Context; it’s right there in the name! So, at least for machine learning, the visual semantic is the base and necessary component for contextual organization. I guess I’m curious if machines could implement the discourse of the document, or is that solely a human task? Could the NYPL scan their magazine clippings and be told what heading each belongs under, especially when they have wild headings like Views from Behind?

Escaping Biases

Processing Posts

Lena Reitschuster | November 20, 2018 | One Comment

Wolfgang Ernst concludes the short chapter assigned for today with the sentence: „Instead, digital data banks will allow audio-visual sequences to be systematized according to genuinely signal-parametric notions (mediatic rather than narrative topoi), revealing new insights into their informative qualities and aesthetics.“ (Ernst, p. 29) These ideas remind me strongly of the Linnaean project: to find intrinsic qualities in the object, picture or music piece to fulfill the dream of a „true“ and objective classification system.

But I think the critique of Diana Kamin is very important in this case, that even „under the most machinic of circumstances, the eye of the expert collector is smuggled in, with attendant biases and values“ (Kamin, p.331) due to the curated training sets that are used to enable a machine to “see”. Maybe the underlying question is: Will it ever be possible to escape the biases?

Voluminous problems

Processing Posts

Lydia Nobbs | November 20, 2018 | One Comment

Back in 2011, I visited an exhibition by photographer Erik Kessels at Foam, a photography museum in Amsterdam. It was an invitation to wander through rooms full of unordered mounds of printed photographs – every photo that had been uploaded to Flickr within a 24-hour period. At that time, the daily upload was around 1 million images. This was just before the total ubiquity of smart phones, before Facebook acquired Instagram, around the start of photo sharing becoming a core component of communication. As a mechanism for appreciating the volume of images generated, the exhibition was both memorable and formidable.

24hrs_of_photos — Photo: Erik Kessels, http://www.kesselskramer.com/exhibitions/24-hrs-of-photos

According to some stats from earlier this year: 300 million photos are uploaded to Facebook every day, 95 million images and videos are uploaded to Instagram, and a total of 4.7 trillion photos were stored digitally by the end of 2017. And of course, there’s an exponential curve there.

Photo: Erik Kessels, http://www.kesselskramer.com/exhibitions/24-hrs-of-photos

When I think back to those rooms, and how daunting the volume was even then, I viscerally felt Tagg’s comment about “the danger of being entirely submerged if the other cameras follow suit and the stream becomes a deluge.” There is no solution to the challenge of ordering and archiving such image-based data that does not involve some “archiving machine,” whether submission is to the logics of a filing cabinet or photo recognition AI.

Spiegelman’s “Words: worth a thousand” seems a quaint account of the problem of ordering pre-digital images. The senior librarian’s comment that the “indexing will become more rational when we go to digital storage” seems a radical simplification of whose definition of “rational” will have deciding power over ordering.

Perhaps it is the time of the semester, and having to deal with my own problems of “overaccumulation” of information, but surrendering to the convenient tyranny of AI suddenly seems to make sense. Yes, all ordering, codifying, archiving, will “make us ask what we have lost of our being to archival machines” – but there was a “certain lack of precision” in human ordering of pre-digital photographs too. We have never been in control of our data; some machines just give us the sense that we are.

Processing post

Processing Posts

Laura Pannekoek | November 20, 2018 | One Comment

What new type of research, or what new type of statements, can be made from an algorithmically generated (or processual) archive? That’s definitely something worth thinking about, but when Ernst argues that this means that the audiovisual archive “can for the first time be organized not just by metadata but according to proper media-inherent-criteria – a sonic and visual memory in its own medium” (28), it also seems, for us, like a step back into thinking of the archive as a transparent source.

He places the archival medium then again somehow outside of politics, as some objective mechanism before human perception, as “a genuinely code-mediated look at a well-defined number of information patterns.” (29) I get that the strength of his argument comes from the idea that digitization to some extent strips medium specificity, at least for the machine, and subjects it to code. When both sound and images become binary, the archival implication is that human tastes and distinctions become less important. The archival medium becomes the first archaeologist, historian or researcher before its human user. I’m not sure “machine objectivity” is exactly what Ernst is after, but I argue we should be wary not to fall back into “algorithm = objectivity,” as Kate Crawford has shown in her talk on face-recognition bias. It’s the same old bias – but now hardcoded.

Processing Post

Processing Posts

Aarati Akkapeddi | November 20, 2018 | One Comment

“Javitz sharply criticized his ideas, cautioning that his approach required a subjective appraisal riddled with personal aesthetic bias that would endanger the objective, impartial study of images.”
This was funny to me because if one visits the NYPL Picture Collection, the categories are quite subjective. For example, in an image of the moon in the night sky, how does one determine if the picture should fall under moon, or sky?

The two genealogies of image classification discussed in Kamin’s piece also made me think of the different computer vision techniques used in analyzing images and whether they might fall into either category. For example, I might categorize object detection under Javitz’s line of thought, whereas something like the watershed algorithm (which views the image almost as a black-and-white topographical map, seeing brighter areas as elevated points) fits better with Karpel’s philosophy. Regardless of this binary categorization, I think the notion that there are many, many ways to analyze an image is interesting and also carries over into Computer Vision.
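To make the topographical-map idea concrete, here is a minimal Python sketch of a simplified, marker-based flooding in the spirit of the watershed algorithm. It is an illustration under stated assumptions, not a library implementation: the function name `watershed`, the seed positions, and the tiny image are all hypothetical. Brightness is read as elevation, and each labelled seed spreads outward, claiming the lowest pixels first.

```python
import heapq


def watershed(heights, markers):
    """Flood a 2D height map from labelled seed pixels, lowest pixels first.

    heights: list of rows of brightness values, read as elevation.
    markers: dict mapping (row, col) -> positive integer basin label.
    Returns a grid of basin labels with the same shape as heights.
    """
    rows, cols = len(heights), len(heights[0])
    labels = [[0] * cols for _ in range(rows)]
    queue = []
    for (r, c), label in markers.items():
        labels[r][c] = label
        heapq.heappush(queue, (heights[r][c], r, c))

    while queue:
        # Always continue the flood from the lowest pixel reached so far.
        _, r, c = heapq.heappop(queue)
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                # An unclaimed neighbour joins the basin of the pixel that reached it.
                labels[nr][nc] = labels[r][c]
                heapq.heappush(queue, (heights[nr][nc], nr, nc))
    return labels


# Two dark "valleys" separated by a bright vertical "ridge" in the middle column.
image = [
    [10, 20, 200, 20, 10],
    [10, 30, 220, 30, 10],
    [10, 20, 210, 20, 10],
]
basins = watershed(image, {(1, 0): 1, (1, 4): 2})
for row in basins:
    print(row)
```

The two valleys end up in separate basins, split along the bright ridge. Which basin the ridge pixels themselves join depends on tie-breaking in this sketch; fuller implementations treat such pixels as explicit boundary lines.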

Ethics and justice in archiving?

Processing Posts

Laura Pannekoek | November 13, 2018 | One Comment

After reading the Caswell interview, I am not sure if ethics and justice are the right language to discuss the archiving of complex and morally ambiguous social issues. She says that for her, social justice and archiving “100% overlap”, and this works when she talks about issues of representation of minority groups. But when the issues get more tricky, archiving for social justice or archiving social injustice becomes difficult to talk about in terms of rights and wrongs. What does it mean to document contemporary German Islamophobia ethically?

Caswell calls for the ethics of care, in part as a response to the abstract “metaphor” of The Archive. While I understand some of the issues archivists and archival scholars must have with this type of theory, it seems to me that archiving for “a more just world” is a lot more abstract than the Foucauldian notion of the archive. To me, it seems a lot more useful to ask what statements, studies, or claims could arise from the way in which Islamophobia is documented, rather than to consider it in terms of social justice.

Acknowledging and Addressing Archival Injustices

Processing Posts

Iltimas Doha | November 13, 2018 | One Comment

I am surprised to read in the Caswell interview that archivists are resistant to online records, and that records require materiality. Perhaps I am missing a key difference between an archive and a record, but that seems to exclude a vast amount of data. As mentioned, it excludes oral and kinetic records, but does that also exclude databases/online records?! What are the characteristics of a dataset that would make it a record? And in thinking about archiving radical movements, I also struggle to see how we can stick only to materiality when entire political struggles are started and maintained through hashtags.

I’m also interested in the ethics of cataloging radical movements. Like the participants of On Our Backs, should protesters be subject to having their dissidence be preserved? Is there an intersection of anonymity and accuracy + authenticity of archives?

And a last thought on radical digital archives, with the advent of deep fakes, professional trolls, and misinformation, what are the ethics of including, for example, false tweets and misinformation? On one hand, presenting that “data” on equal ground with legitimate data is problematic. On the other, those campaigns should be documented as a part of fighting these struggles.

I guess this week brought more questions than it did answers…

Processing post

Processing Posts

Aarati Akkapeddi | November 13, 2018 | One Comment

The points brought up in all of these readings/talks/interviews support my thinking that the decolonization of archives is much more complex than an ‘undoing’ of archival injustices. It is not simply a matter of repatriation or ownership, a returning of materials to where they come from. The very methodologies used in colonial archiving practice (for example, as Christen brings up, the viewing of indigenous/colonized peoples as subjects of study rather than collaborators) have enduring effects on the categorization, preservation, metadata, and dissemination of these artifacts even in today’s context. Moving forward, I would also like to linger on the question of what non-western archival practices look like. Several times in the interview, Caswell draws a strong dichotomy between western and non-western archival thought, particularly with the notion of subjectivity: “Records are supposed to be impartial, which means that the people creating them should have no notion of how they might wind up in an archives in the future.” This is an important distinction because all of these readings argue that archivists should have respect for the intended visibility, distribution, and preservation of artifacts during their creation (e.g., the intended illegibility of certain rap lyrics for particular audiences (Doreen St. Felix), or the right to be forgotten).

Mindsets & Toolsets for Self-Archiving

Processing Posts

Jed Crocker | November 12, 2018 | One Comment

I was struck by the recurring themes of awareness, empowerment, and the efforts to provide communities with tools to archive themselves that ran throughout the material for this week. The interview with Michelle Caswell provided several examples of this in her own work and in that of those who have inspired her, all stressing the importance “to use the same language [in archival projects] that communities use to describe themselves.” She builds on this in the following article regarding models to employ “radical empathy” and core tenets of social justice in archival practice. These sentiments are expanded upon in Kimberly Christen’s work with Traditional Knowledge licensing and labeling systems for use in the handling of indigenous cultural digital materials. I was particularly interested in the iconography of the TK labeling system that was highlighted in this work, which uses visual cues to potentially expand the reach of this system through educational/social channels. Bergis Jules, in the “Failure to Care” panel discussion, neatly and succinctly articulated these efforts via his interest in the “usability of data archiving tools as a way to diversify the historical record.”

From the same panel discussion, I am also interested in Doreen St. Felix’s comment on the griot as a sort of “ghoulish” figure in West African culture/society. This role of musician/historian/storyteller is another example of embodied archival knowledge via the distribution of oral history. I hadn’t previously considered or known about the darker contexts/associations of this cultural figure.

Lastly: Evan Hill’s article (focusing on the Mosireen archive project “858,” which documents smartphone videos of the Egyptian protest movement in 2011) makes a keen observation in its conclusion: “We say the internet never forgets, but internet freedom isn’t evenly distributed: When tech companies have expanded into parts of the world where information suppression is the norm, they have proven willing to work with local censors. Those censors will be emboldened by new efforts at platform regulation in the US and Europe, just as authoritarian regimes have already enthusiastically repurposed the rhetoric of ‘fake news.’” The subject of intense moderation of major social media and networking platforms is the focus of the film highlighted on this week’s Independent Lens on PBS: The Cleaners, by Hans Block and Moritz Riesewieck. (I haven’t watched it yet!)

Activism and Resistance in the Archive

Processing Posts

Lena Hansen | November 12, 2018 | One Comment

In Cole and Griffith’s interview with Michelle Caswell, she explains her understanding of archives and archiving as “infused with a social justice ethics” (Cole and Griffith 24). Unlike in a museum context, where the form and aesthetic of an object are prioritised, the archive inherently contextualises the material. I understand this to be the opportunity and space for resistance and social justice. For example, archival metadata can use the same language and terminology that a community uses. While this allows those in the community to access this information, depending on the context and archival institution, the decision to use a different language or specific term can in itself mean taking a political stance.

In Christen’s article, independent web portals and digital archives in which tribes have control over databases and the creation of records are said to “deliberately position Indigenous communities themselves as the owners and custodians” (Christen 4). This deinstitutionalized archive and Caswell’s post-custodial archive have been criticised for their structure, and their status as archives has also been called into question. Is an archive just a collection of records? What is it about these archives that threatens the authority and legitimacy of an institutional archive?

The Limits of Memory

Processing Posts

Tim Murphy | November 12, 2018 | One Comment

I found this week’s reading assignments interesting because they highlighted that archiving itself can be a revolutionary act. Expanding upon the notion we have been working with that ‘archiving is memory’, the simple act of documenting and archiving populations and events, especially disenfranchised or ‘non-mainstream’ populations, provides a voice for the voiceless within collective ‘memory’. As Caswell states, “Fundamentally to me, the act of remembering and forgetting is about creating a future in which resources are more equitably distributed. For me, archival labor should be infused with a social justice ethics.” The ‘act of remembering’ is in itself a powerful tool, and a tool the archivist can use to turn ‘memory’ into something more permanent and tangible.

Looking at topics like the Arab Spring or indigenous peoples can give us a look into the types of populations that in an earlier time would have gone undocumented or even been forgotten. In both the analog and digital ages our ability to archive has been limited: limited by technology, by public or private interests, by political or religious taboo, etc. Referring to the Arab Spring, Evan Hill states, “One broken link at a time, one of the most heavily documented historical events of the social media era could fade away before our eyes”. Just as our human brain is limited in the quantity and detail we can remember, so is our ability to ‘remember’ in our archives, both in analog and digital environments.

Inviting “the other” + invisibility politics

Processing Posts

Allie Mularoni | November 12, 2018 | One Comment

Having been raised by a single mother, my understanding of domesticity is imbued with certain “feminist ethics.” I was particularly moved by Caswell and Cifor’s idea of radical empathy, one that involves a kind of hospitable guidance of “the other” in archival interventions (2016, p. 25). At the risk of extolling midwestern friendliness, I take this invitation of the other to mean the potential to bring together disparate, perhaps even incompatible, articulations of “care.” Failed attempts to ethically preserve cultural knowledge reveal the collective tendency to efface the granularity of these archival materials. However, the power relations enfolded into politics of invisibility complicate the right to privacy. As Doreen St. Felix notes, some work is produced with an intended illegibility: “not every artist wants everyone to understand.”

What does become clear in the digital landscape is that cultural material produced and preserved online faces more questions than those cared for in historically private spaces like the home. We cannot domesticate the Web.

What’s In A Name?

Processing Posts

Tress | November 11, 2018 | One Comment

In September 2017, during a Twitter Q&A session, #AskACurator, hosted by the British Museum, curator Jane Portal tweeted: “We aim to be understandable by 16-year-olds. Sometimes Asian names can be confusing – so we have to be careful about using too many.” The Museum has since issued an apology.

This type of cultural imperialism in the Western world is so ingrained in archival practices that museums can sometimes feel like less of a place to acquire cultural knowledge and more of an institution of capital-gaining cultural appropriation and superficial gandering. That tweet demonstrates some of the themes in this week’s readings, namely the failure to include a community in an effort to create an archive pertaining to its heritage, and the effort to name items in the vernacular of said group. This also raises a question mentioned in the Digital Social Memory panel: who is the audience?

When a name is changed or taken away it strips the cultural significance of the item; it replaces the bodies and histories attached with a centerpiece, a decoration; it panders to those who are not affected by the misrepresentation. I wonder what is the full range of dangers or limitations of uninformed naming practices in archives? What are the politics of inclusion when an archivist so steeped in their own hegemonic viewpoints only considers the audience and neglects the bodies which created the content?

Flickr: Collective Photographic Memory

Processing Posts

Philipp Schmitt | November 8, 2018 | One Comment

I was struck by the connection between the readings and a controversial announcement by online photo sharing platform Flickr earlier this week: In February 2019, the platform will delete any pictures in excess of 1,000 photos per user, unless they upgrade to a paid account. Since 2013, the site had offered 1 terabyte of free storage per user, or around 100,000 photos (assuming 10 megabytes per photo, which is larger than most).
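The back-of-the-envelope arithmetic behind the 100,000 figure can be checked with a short Python sketch. It assumes decimal units (1 terabyte = 10^12 bytes, 1 megabyte = 10^6 bytes) and the 10 megabytes per photo stated above; the variable names are only illustrative.

```python
BYTES_PER_TERABYTE = 10**12  # decimal (SI) terabyte, an assumption of this sketch
BYTES_PER_MEGABYTE = 10**6   # decimal (SI) megabyte

free_storage_bytes = 1 * BYTES_PER_TERABYTE    # old free tier: 1 terabyte per user
assumed_photo_bytes = 10 * BYTES_PER_MEGABYTE  # assumed 10 megabytes per photo

# How many photos of the assumed size fit into the old free tier.
old_capacity = free_storage_bytes // assumed_photo_bytes

# The new free tier caps accounts at 1,000 photos.
new_limit = 1_000
share_kept = new_limit / old_capacity

print(old_capacity)  # number of photos under the old allowance
print(share_kept)    # fraction of that allowance that remains free
```

Under these assumptions, the new cap corresponds to one percent of what the old free tier could hold.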

For a long time, Flickr was a safe haven for some great, many mediocre and countless terrible photographers whose oeuvre I won’t mourn. However, seeing their “photographs as records, first and foremost, not as aesthetic objects or art”, as Caswell argues, demonstrates what’s at stake: I don’t know how many of the platform’s ~7 billion images will be deleted in February but, and here Evan Hill joins the chant, “at stake is nothing less than our collective memory.”