Articulating Computer-Generated Poetry

A few days ago, a poetry podcast that I’ve never heard of before – Commonplace: Conversations with Poets (and Other People) – posted an episode in conversation with Allison Parrish. Allison Parrish is one of the big people in computer-generated poetry. Even if you’ve never come across her name, you may have come across one of her many artsy projects. From 2007 to 2014, for example, Parrish managed a Twitterbot that tweeted ‘every word in the English language’. Her Deep Question Bot regularly throws ridiculous (but at times quite thought-provoking) questions into the Twittersphere. She’s contributed some gems to National Novel Generation Month (I Waded In Clear Water is one of my faves). This particular interview was in light of her recently released full-length print book, Articulations, which is part of the Using Electricity series of computer-generated print books.

So Parrish is identified as a poet, but her poetry is all produced through digital media that actually produce the poems for her. She’s using natural language generation for aesthetic purposes, rather than for the more pragmatic purposes for which it is usually applied: for example, data-driven news articles. Sure, computer-generated news articles are cool, but poetry’s a whole other world to explore. It’s flexible, malleable, all over the place.

But how does Parrish define poetry? At the beginning of the episode, Parrish cites Charles Hartman, who authored the 1996 book Virtual Muse. In Virtual Muse, Hartman recalls his experiences as he attempted to develop a poetry generation program. Hartman concluded that it seemed as though his program was making poetry, but wasn’t making poems. Parrish runs with this, elaborating that:

Poetry is just something that sort of exists from one line to the next. It’s a feeling that you get from a stretch of language. But a poem is something that has internal structure, and there are elements that have to come together with a particular hierarchy. It has to have a beginning, and an end, and a middle, and stuff like that. And the stuff that I do generally doesn’t have those properties. But a book, of course, has to have a beginning, a middle, and an end, because it’s a physical object. So with the book [Articulations] I had to figure out how to make something that starts and ends, and I basically just ended up punting on that problem.

I especially like this explanation because, not only does Parrish actually provide her working definition of a poem, she also makes clear her own assumptions about what books are, and what texts should include. This quotation is really all about how Parrish is trying to negotiate cultural expectations related to literary convention (in this case, the conventions of poetry) and her own work with poetry generation. While her work definitely exploits the novelty factor of computer-generated texts, for Articulations Parrish is forced to rein herself in and adhere to more standardized formats of text presentation. She needs a beginning, a middle, and an end, even though her programs will keep generating texts indefinitely.

The episode actually does go on to discuss some of the limitations of current text- and book-based assumptions. Parrish explains why she agreed to publish her digital-born poetry in a more traditional codex form. There are a few reasons. I’ll block-quote these ones, because they’re super important.

One is just related to the different kinds of reading that different media encourage. Looking at something on your laptop screen, or looking at something on your phone, or looking at something on your Kindle – we want to think that that experience of reading is just completely analogous to the process of reading it on a piece of paper, in a codex format, but it just isn’t. So much about where you can take it, what you can do with it, how it feels – the modality is just different, and there are particular kinds of readings that you just can’t do in other media. That’s something that gets totally lost, and one of the reasons it gets totally lost is that companies want us to be paying attention to our glowing rectangles, because that is visual real estate that they control constantly and can change whenever they want. That’s, for me, a huge reason why a book is an amazing thing, because I’m never going to get a notification pop up inside of a book. There’s never going to be an advertisement inside the book that has been tailored for me, in that moment…

There’s also something about the polishing process, which actually ends up being weirdly… there’s something almost more democratic about the process of publishing on paper because the process of who is making these decisions and who is getting the attention isn’t necessarily tied in as much with these other systems of power that are projected very, very easily in a digital format…

There’s also the fact that people just like to have things.

Parrish really nicely articulates the effects of media on reading experience. We approach the analogue and the digital rather differently, largely because of our expectations of what these media are to be used for and how they are to be interpreted. These expectations are themselves shaped by ever-changing cultural narratives about technological development: whether we should embrace it, whether we should fear it.

Parrish also goes on to discuss the ‘threat of permanence’ afforded by a traditional print book: in our digital worlds, our interactions with texts are often so fleeting as to seem completely ephemeral. However, as we’ve seen with some books that have survived, you know, hundreds of years, a print book is a cultural artifact that can stick around for quite some time. Language becomes a material object to be preserved, and a person’s words are no longer constrained by space and time. While Parrish uses the word ‘threat’, she doesn’t necessarily seem to think permanence is a bad thing. It seems instead that she’s just thinking about how translating digital-born texts – regarded as so ephemeral – into print form can completely change their meaning, and how people interpret the texts in question. It’s all about those effects of media on reading experience. Different media encourage different kinds of reading.

So what about the poetry that continues to be written by humans? Parrish calls it ‘intention-typical poetry’. This is one of the parts of the podcast that I wish lasted longer, as the term isn’t defined in enough depth for me. But, from what I gather, ‘intention-typical poetry’ just refers to the understanding of a text as driven by human agency and communicative intention. There’s a message that someone wants to get across, and poetry is the medium used to do so. I guess, then, that ‘intention-typical’ could also be applied to prose. We also tend to assume that prose is the result of someone consciously trying to get some sort of message/story across to us. In this way, a text is a means for connecting humans, for sharing experiences, for congregation.

Indeed, the podcast’s interviewer articulates her own interpretation of computers as tools for manifesting human intention. She declares that ‘my enjoyment and attachment to a piece of art has to do with imagining the human mind, and the process of the person making that thing.’ Parrish responds to this by describing her interactions with poets and people in the humanities, many of whom respond to the notion of computer-generated texts with resistance. How does this technology affront current labour models? What about the feeling humans insert into their texts – what happens to that? Well, Parrish asks, what does ‘computer-generated’ even mean anyway? We all use procedural means in our writing. We follow genre-specific structures. We use computers during various stages of our writing processes. If I depend a bit too heavily on spellcheck, is Microsoft Word a co-author of my text? Parrish similarly considers computers as tools, though, and this becomes clear in her discussion about questions of output ownership. She notes that many of her computer-savvy colleagues working on sonnet generation describe their programs as being responsible for output. However, for Parrish, this kind of narrative negates the incredible amount of work that goes into producing the program that generates the work. Parrish assumes authorship of her computer-generated Articulations because she wants recognition for the time and effort she put in to creating a program that could produce output in line with her own creative vision. Thus, while Articulations is a computer-generated book, it remains profoundly human.

To be sure, things didn’t go entirely according to plan during Articulations’ production. During the first print run, Parrish noticed offensive words and phrases that her program had included in its output, and that had gone unnoticed when the output had been sent to the printer. For this reason, the first print run of Articulations was pulped, and the offensive words and phrases removed so that the book could be reprinted in accordance with Parrish’s ethical requirements. This human intervention further supports the idea of Articulations as a human product. As a human product, someone is responsible for ensuring that the text is ethical. Unfortunately, the interview didn’t seem too interested in questions of ethics and didn’t delve further into the topic. Eugh.

This conversation with Allison Parrish was fascinating, and left me with so much to think about for my own research on the social implications of natural language generation, even if I’m more focused on prose. Parrish is articulate, insightful, and funny, and is well worth setting aside two hours of your day for.

Alternatively (or in addition), you can watch this video of Parrish discussing some of her work at the 2015 Eyeo Festival.

Meow Meow Meeeeeeoooooow: NaNoGenMo 2017

Every November since 1999, writers from around the world have observed National Novel Writing Month (NaNoWriMo). The challenge? Spend the month writing a novel of at least 50,000 words. This year alone, NaNoWriMo anticipates more than 400,000 participants. Past NaNoWriMo successes include Sara Gruen’s Water for Elephants, a 2006 bestseller that later became a movie featuring Robert Pattinson and Reese Witherspoon, and Erin Morgenstern’s 2011 The Night Circus, which won an Alex Award from the American Library Association in 2012.

The month of November, however, is not only celebrated by writers worldwide. During the same month, NaNoWriMo’s lesser-known and slightly deranged younger sibling also asserts itself. Proposed on a whim in 2013 by Internet artist Darius Kazemi, National Novel Generation Month (NaNoGenMo) has increasingly gained traction in the programming world. The challenge? Write code that generates a novel of at least 50,000 words. ‘The “novel” is defined however you want’, Kazemi explains. ‘It could be 50,000 repetitions of the word “meow”. It could literally grab a random novel from Project Gutenberg. It doesn’t matter, as long as it’s 50k+ words’.

And before you ask – yes, someone has actually generated a series of novels comprising 50,000 repetitions of the word ‘meow’.
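For the curious, the entire ‘meow novel’ technique fits in a few lines. This is just my own illustrative sketch (the function name and word choice are mine, not the actual submission’s code):

```python
# A minimal sketch of a 'meow novel' generator: 50,000 repetitions
# of a single word satisfy NaNoGenMo's only rule.
def generate_meow_novel(word="meow", count=50_000):
    """Return a 'novel' consisting of `count` repetitions of `word`."""
    return " ".join([word] * count)

novel = generate_meow_novel()
print(len(novel.split()))  # 50000 words, challenge complete
```

Conceptual literature at its most economical: the idea is the whole work, and the code is almost beside the point.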

Computer generation of stories didn’t begin with NaNoGenMo. Indeed, new research reveals that the first known computer story generator was actually developed as early as the 1960s. Even earlier than that, around the 1840s, computer science pioneer Ada Lovelace was considering computational creativity, albeit in slightly different terms. In one popular quotation, for example, Lovelace warns about ‘exaggerated ideas that might arise as to the powers of the Analytical Engine [general-purpose computer]’ imagined by her mentor Charles Babbage:

The Analytical Engine has no pretensions whatever to originate any thing. It can do whatever we know how to order it to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths. Its province is to assist us in making available what we are already acquainted with [italics original].

Stories produced for NaNoGenMo are the result of code shaped by its programmers’ intentions. This code is usually highly specific: the generated text needs to adhere to at least some literary conventions so that it’s understandable and – if you’re lucky – readable.

As one commentator writes:

I think you’ll find that writing code that writes a book that is not-boring to read for the first few hundred words is not too difficult.

After those first few hundred words, though… well, all I can suggest is you download one of the completed novels (from this year or from any earlier year) and try to read the whole thing. The word “boring” does not quite do justice to the experience.

David Stark’s 2014 Moebius Octopus, for example, sounds promising enough: Stark describes the code as ‘mutating Moby Dick to be about sexy space amazons fighting octopodes through a word mapping.’ The generated output, though, quickly loses its novelty when one actually sits down and starts reading the convoluted text.
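Stark’s actual code isn’t reproduced here, but the general word-mapping technique can be sketched simply: a dictionary of substitutions applied across the source text. The mapping below is entirely made up for illustration:

```python
import re

# Hypothetical word mapping in the spirit of Moebius Octopus:
# each key found in the source text is swapped for its replacement.
WORD_MAP = {
    "whale": "amazon",
    "ship": "starship",
    "sea": "void",
}

def mutate(text, mapping=WORD_MAP):
    """Replace mapped words, preserving initial capitalization."""
    def swap(match):
        word = match.group(0)
        replacement = mapping.get(word.lower())
        if replacement is None:
            return word
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(r"[A-Za-z]+", swap, text)

print(mutate("The whale struck the ship at sea."))
# → "The amazon struck the starship at void."
```

Run over 200,000 words of Melville, even a small mapping like this produces prose that is superficially transformed but, as Stark’s readers discovered, still exhausting to read at length.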

Some NaNoGenMo texts, though, are intended to be skimmed. One of the most well-known NaNoGenMo submissions, Nick Montfort’s 2013 World Clock, recounts fictional events from around the world for each minute of a day. One section reads: ‘It is now almost 22:38 in Catamarca. In some homey dwelling a youth named Ephrem, who is rather large, reads a embossed certificate. He chews a fingernail.’ The text continues in this way. And, if you read it from beginning to end, it gets very boring, very quickly. Flip to any of World Clock’s pages and read a snippet, however, and it can actually be pretty thought-provoking.
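World Clock’s underlying pattern – one templated sentence per minute of the day, with randomized names and actions – is easy to imitate. Here’s a toy sketch of that structure; the names and actions are my own invented stand-ins, not Montfort’s:

```python
import random

# A toy, template-based sketch in the spirit of World Clock:
# one generated entry for each of the 1,440 minutes in a day.
NAMES = ["Ephrem", "Mara", "Tomás"]
ACTIONS = ["reads a certificate", "chews a fingernail", "stares at the rain"]

def minute_entry(hour, minute, rng):
    """Produce one templated sentence for the given time of day."""
    name = rng.choice(NAMES)
    action = rng.choice(ACTIONS)
    return f"It is now almost {hour:02d}:{minute:02d}. A youth named {name} {action}."

rng = random.Random(0)  # seeded for reproducible output
entries = [minute_entry(h, m, rng) for h in range(24) for m in range(60)]
print(len(entries))  # 1440 entries, one per minute
```

The simplicity is the point: the combinatorics of a few templates and word lists easily clear 50,000 words, while the minute-by-minute frame gives the text exactly the kind of structure that rewards dipping in rather than reading straight through.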

Indeed, NaNoGenMo submissions are often better appreciated for the ideas driving generation, rather than the generated output itself. One could consider NaNoGenMo as an opportunity for programmers to explore conceptual literature by making their own. For example, a 2015 submission by Duncan Regan, The Cover of The Sun Also Rises, converts a photo Regan took of his copy of Hemingway’s The Sun Also Rises into a novel and audio book. The text begins: ‘Quartz. Davy’s grey. Purple taupe. Gray. Pale silver. Almond. Pastel gray. Pale silver. Ash grey. Ash grey. Manatee. Taupe gray. AuroMetalSaurus. Dark electric blue. Teal blue. Teal blue. Teal blue. Teal blue.’ 2014 saw the release of Greg Borenstein’s Generated Detective, a noir comic generator that pulls sentences from Project Gutenberg’s detective novel corpus and pairs them with a public domain image that is run through an application that transforms the photo into a comic book style. In 2016, NaNoGenMo included a ‘newspaper blackout poetry’ submission, the output of a program that scribbles out most of scanned pages save for select words that create new sentences. 2016 also saw NaNoGenMo as a university module assignment. All of this output restructures old text into new forms that allow us to consider what already exists with a fresh perspective. Channelling Ada Lovelace, one could argue that these NaNoGenMo codes have ‘no pretensions whatever to originate any thing… [their] province is to assist us in making available what we are already acquainted with.’
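The technique behind Regan’s The Cover of The Sun Also Rises – mapping each pixel to the nearest named colour – can be sketched as a nearest-neighbour search in RGB space. The tiny palette below is an illustrative stand-in for a full named-colour list, and the RGB values are approximate:

```python
# A sketch of pixel-to-colour-name conversion in the spirit of
# The Cover of The Sun Also Rises. Palette values are illustrative.
PALETTE = {
    "Quartz": (81, 72, 79),
    "Gray": (128, 128, 128),
    "Almond": (239, 222, 205),
    "Teal blue": (54, 117, 136),
}

def nearest_colour_name(rgb, palette=PALETTE):
    """Return the palette name with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, palette[name])),
    )

# A few sample 'pixels' read off an imagined book cover:
pixels = [(80, 70, 80), (50, 120, 130), (240, 220, 200)]
print(". ".join(nearest_colour_name(p) for p in pixels) + ".")
```

Walk that function across every pixel of a photographed book cover and out falls Regan’s litany of colour names, 50,000 words of pure description with no describer.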

This small selection of submissions described above shows the diversity of NaNoGenMo, but only represents a fraction of the ways in which computer storytelling systems can be applied. Some early posts about this year’s NaNoGenMo indicate that we’re in for an exciting month. We can expect a soap opera simulation, a Choose-Your-Own-Adventure novel, and a ‘Chatty Chess Engine’ that narrates in-depth analyses of the chessboard and potential moves, to list just a few.

Get excited, y’all. It looks like we’re in for a weird ride come November. Join me in following the development of 2017 NaNoGenMo submissions here.