Archive for the 'Giant Stories, Tiny Screens' Category

Meet Eliza, the Flashiest Phone Bot Around!

Eliza sits at her desk in her office. She completes ordinary office tasks—she checks her email, she drinks her coffee, she gets up to go photocopy something or talk to a colleague, and once in a while she checks out the New York Times. Little does she know, she’s being livestreamed to the whole world over the web. If someone calls, she picks up. Sometimes she recognizes the caller, sometimes she does not, and sometimes the connection is so bad that she hangs up and calls back.

Eliza lives on a screen in an eddy of a high-traffic area, say an out-of-the-way elevator lobby in a busy building. A user sees her, and after a couple of minutes his curiosity gets the best of him: he succumbs to the flashing invitation and calls. To his surprise, after a couple of rings Eliza picks up. Phone conversations are ritualized in the first place, and the added awkwardness of voyeurism and of conversing with a stranger creates the ideal situation for Eliza’s black-belt phone jujitsu, which with minimal effort wrests control of the conversation from her interlocutor. It’s a bit like a good dancer foxtrotting and waltzing an overwhelmed novice around the floor.

The prototype is rough, but it works. Because of Flash’s arcane and draconian cross-domain security measures, though, I can only run it locally through the Flash IDE or stream it from my machine using a personal broadcasting service like ustream or livestream. For it to work properly on the web, I’d have to host all the components I enumerate below on a single box, something I have neither the hardware nor the inclination to do. The main problem is that I’m making XML socket connections from Flash; if I used a PHP intermediary, I could probably get it working, but again, the inclination is missing and the thing is already mindbogglingly complicated. Maybe at some point in the future. The following video demonstration will have to do in the meantime.


Warning: this is not for the faint of heart.

Eliza has a ton of moving parts:

  1. The Asterisk script: A simple program that answers phone calls and hands them to a PHP script, which connects via socket to the main SWF.
  2. Various PHP scripts: One to handle connections from Asterisk, one to reset connections from Asterisk after a call ends, and one to initiate callbacks when required.
  3. A simple Java socket server: Adapted from Dan Shiffman’s example, this program runs in the background on the Asterisk server, waiting for connections (phone calls). When a call comes in, the server accepts the connection and broadcasts call events (new call, hangup, button press, etc.) to the PHP scripts and the main SWF, allowing them to talk to each other.
  4. The main SWF: This is the brains of the operation. It loads the movies of Eliza and controls the logic of their looping as well as the logic of the audio (via socket connection back to PHP and then to Asterisk via AGI).
  5. The looping movie files (not completely smooth in this prototype, notice the moving phone and the changing light conditions!): These live in the same directory as the main SWF, which streams them as needed (for a web deployment, they’d probably have to be pre-loaded and played back).
  6. The sound files: These live on the Asterisk box (completely separate from the movies) and are played back over the phone, not the web.
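
The relay role of the socket server in item 3 can be sketched roughly as follows. This is a minimal in-memory model in Python rather than the actual Java program, with no real sockets involved. The client names, the event names, and the `Hub` class are all hypothetical stand-ins for illustration, since the post only lists "new call, hangup, button press, etc."

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical event names, loosely following the events listed in item 3.
EVENTS = {"new_call", "hangup", "button_press"}


@dataclass
class Hub:
    """In-memory stand-in for the socket server: relays each event to every other client."""

    clients: Dict[str, Callable[[str, str], None]] = field(default_factory=dict)

    def connect(self, name: str, on_event: Callable[[str, str], None]) -> None:
        # Register a client (e.g. a PHP script or the main SWF) and its handler.
        self.clients[name] = on_event

    def disconnect(self, name: str) -> None:
        self.clients.pop(name, None)

    def broadcast(self, sender: str, event: str, payload: str = "") -> List[str]:
        # Deliver the event to everyone except the sender; return who received it.
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        delivered = []
        for name, on_event in list(self.clients.items()):
            if name != sender:
                on_event(event, payload)
                delivered.append(name)
        return delivered


def demo():
    """Simulate one call: Asterisk reports events, the SWF and a PHP script hear them."""
    hub = Hub()
    swf_log, php_log = [], []
    hub.connect("asterisk", lambda e, p: None)
    hub.connect("main_swf", lambda e, p: swf_log.append((e, p)))
    hub.connect("php_callback", lambda e, p: php_log.append((e, p)))
    hub.broadcast("asterisk", "new_call", "caller-1")
    hub.broadcast("asterisk", "button_press", "5")
    hub.broadcast("asterisk", "hangup")
    return swf_log, php_log
```

In a real deployment each handler would be a write to an open socket, which is exactly where the cross-domain restrictions start to bite.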

UPDATE: I’m presenting Eliza at Astricon in DC in October, so I should have some interesting observations to report soon. There are several things I’d really like to do. First, I’d like to actually get this working somewhere where I can observe lots of unsuspecting folks interacting with Eliza. I never really got to see someone who didn’t know the backstory calling in, partly because I was exhausted from thesis when I had the chance to show it and partly because there were lingering bugs I had not yet located that occasionally caused the whole thing to stop working. So many things on so many separate machines can go wrong that it took quite a while to troubleshoot. A larger sample of reactions would allow me to rework the conversations so that they’re more disorientingly convincing: better pause timing, more realistic intonation, and some accounting for repeat callers’ stratagems to see if Eliza is real. I could then reshoot the video so it is completely seamless. That would require monitors, good lighting, laser levels, an OCD continuity editor, and several days of shooting.

If you know of an easy way to overcome the cross-domain headaches, leave me a comment! If you want to fund such an undertaking, please do get in touch! Otherwise, enjoy the video above.

As the world Turings…

For two years I’ve flirtatiously circled the Pygmalion myth, toying with human-machine interactions in which it’s not necessarily clear to the human that s/he’s interacting with a machine or human-human interactions in which both participants are convinced that the other is a machine. I can’t seem to get away from this idea of tricking people into adopting mistaken mental models of interactions. I thought it would be fun to create two bots that would follow each other on Twitter. Caleb Larsen, whom I’ve written about before and with whom I’m beginning to believe I share an eerie and otherworldly mental connection (I found this today, compare to my Obama piece) created a script that updated his Facebook and tweeted randomly generated status messages as part of Whose Life is it Anyway, though in the end he abandoned algorithmically generated messages for appropriation of other people’s statuses—which I find conceptually stronger but no longer relevant to the topic at hand.

In any case, in 1950, Alan Turing wrote a paper about thinking machines. In it he proposed a thought experiment in which a person is asked to converse via teletype with a person and a computer pretending to be a person. If s/he is unable to definitively distinguish between the two, goes his argument, the computer is effectively intelligent. People have taken issue with Turing’s conception of intelligence, but nonetheless, over the years, this “Turing test” has spawned doctoral dissertations, colloquia, academic prizes, late-night geek-outs, and many software implementations of computerized interlocutors or “chatterbots.” The first of note was Joseph Weizenbaum’s ELIZA, a Rogerian psychotherapist (you can still talk to her here). She was followed by PARRY, a paranoid schizophrenic. A match made in heaven, I know, but their conversations weren’t nearly as interesting as this exchange between more sophisticated later chatterbots (in this case ALICE and Jabberwacky). Awesome.

There have been a couple of really good ITP projects that riff on what I might call the nebulous interlocutor. Generative Social Networking is my absolute favorite ITP project: in conception, in execution, even in documentation. After using a Bluetooth exploit to download all the contacts on your cellphone, a program calls each number in succession, playing a recording of the last person it called as the other half of the conversation. The most amazing thing when you listen to the demo is the realization that most people have no idea they’re talking to a recording! And some of the “conversations” that develop would easily fool a casual observer too. The ritualized form of phone conversation, combined with the latencies and poor connections to which frequent cell phone use has accustomed us, makes it really hard to tell the difference.

That was part of what made the Popularity Dialer so much fun (and ultimately led to its demise, though creators JennyLC and Cory referred to it last week as “dormant” rather than dead). The premise: popular people get lots of phone calls, so what better way to enhance your popularity than by increasing the number of calls you receive? Enter your phone number on a website and schedule a call from one of five characters who think you’re awesome (girl dying to date you being my favorite). At the appointed time, your phone rings and the voice you’ve selected speaks its half of a recorded phone conversation, pausing several beats for you to respond. It seems totally real to onlookers. The problem? It seems totally real to many of the people receiving the calls after their friends entered their number as a prank. Worked great until a humorless FCC lawyer got a call late one night from the dude wanting to get some beers.

I love both of these projects. They raise questions about the subjective nature of interaction that don’t get discussed all that much in the literature. So much of an interaction is in our heads. That’s the great lesson of Apple’s marketing—you can take a shitty phone that’s uncomfortable to hold and inconvenient to talk on, but if people are emotionally attached to it, they’ll find using it a pleasure anyway (I think Donald Norman might have said something similar a little more eloquently). In our case, if I think I’m talking to a real person, my experience of that conversation will be radically different from my experience of the exact same conversation if I know I’m talking to a recording—just think of that weird, disjointed feeling you get when a friend’s answering machine picks up and you think it’s him and start talking only to realize a second later that it’s a recording. Richard Powers’s Galatea 2.2 deals with this notion of artificial intelligence as deception, and I want to as well.

I propose a film of a woman with her back to the viewer. She is obviously concentrating hard, occasionally tapping a pencil or reaching for her coffee mug but otherwise moving very little. A phone number is displayed beneath the frame. The viewer calls the number and suddenly the phone on the desk next to the woman rings. She picks it up, and the viewer is amazed to hear her voice both on the screen and through his phone. He speaks to her. She responds that the connection is not clear, she can’t hear him well. He tries to gauge whether she is a real video or a clever program. She hangs up in anger and frustration. She looks at her phone and decides to call back. The viewer’s phone rings and when he picks up, she apologizes for the poor connection and asks him a question. When he answers, she asks another. Suddenly, she has to go. She apologizes, turns toward the screen, waves, and hangs up. The viewer scratches his head and calls back. Her phone rings, she looks at the number and sends the caller directly to voicemail with an over-the-shoulder wag of the finger. And scene!

I’ve seen a couple implementations of phone-enabled interactive movies, but they’re infantile choose-your-own-adventure narratives constructed like corporate phone trees (“if you’d like to see the hero die, please press pound now, otherwise, stay on the line for more options”). I want the interaction to be the purpose of the piece, not a means of advancing a canned story, though I do love the bizarro preview man voiceover in this German interactive “horrah” film:

My system works in a similar way, though without all the voice recognition. I’m interested in exploring how much of such an interaction is actually reactive. In Japan, for instance, it’s definitely over 50%, but I’m working on the assumption that it will be similar for the viewer speaking on the phone, that the character in my movie won’t need to respond directly to the viewer’s words because the social inertia that carries people through uncomfortable party conversations with socially maladroit companions will cause him to behave a certain way in this particular interaction, enough that I’ll be able to maintain some doubt as to whether he’s actually participating in a real conversation. Based on several recent interactions with customer service representatives over the phone, I can’t swear that health insurance companies haven’t already commercialized and adopted this system.


Plentiful bandwidth, virtually free storage, and internet-connected cameras have translated into a glut of online video. When anyone can upload to the online panopticon, it’s only a matter of time before people start exploiting the web’s massive audience to crowdsource moonwalks, personal interpretations of the Mos Eisley Cantina scene, ads, or homemade porn, for fun and for profit.

Well, guess what? I don’t want to see your videos. Not the ones you’ve uploaded at least.

The proliferation of cameras everywhere makes it less and less likely that you are ever not being recorded and uploaded the minute you do something remotely interesting. See, for instance, Hong Kong Bus Uncle, the infamous “don’t tase me, bro” (which I find so distasteful that I refuse to link to it), Chinese Airport Woman, and el niñato de Valencia. But again, these are actions performed in public—the operating assumption has to be that someone is recording. And with sites that make live broadcasting as easy as hitting a button on your phone (UStream for instance) popping up like nefarious little mushrooms, it’s entirely possible that your public meltdown will be captured and transmitted live and from several different angles. Totally unscripted reality TV, it’s like your real life, only more interesting.

But not to me. I’m more interested, at least for the purposes of this argument, in recording deviously, either in secret or with unacknowledged intentions. At some point in the future, it’s conceivable that there will be no place where one is legally protected from being filmed and/or photographed. Or there will be just so many people and devices filming and uploading so many things that prosecuting them all will be impossible, which is functionally equivalent. It is from said future that the ideas that follow come.

What if I created an iPhone app that requires you to hold the device up to your ear as if you were talking on the phone (or when you’re actually talking on a phone with an open source platform) entirely as a pretense to upload video that the camera on the back of the phone is recording without your knowledge? There would probably be a lot of hands in the way, but that would make it easier to filter through the results in software. You’d never be in the video so it would be hard to definitively identify it as yours.

A slightly more elaborate variation on that theme would be to build cameras into other devices. One of the big payoffs for me of the Eternal Moonwalk mentioned above is that the majority of people tend to moonwalk across their living rooms, so you get to see the insides of people’s homes all over the world. What if everyone who bought a Roomba were unwittingly inviting an autonomous, wireless streaming surveillance camera into their home? The easiest way I can think of doing this is embedding cameras into particularly nice pieces of furniture left out on New York City sidewalks.

Page scraping and iframes offer another interesting alternative video source, which might actually be much less illegal since technically you’re not moving the video from its original location. Instead, you’re finding video content, preferably unembeddable proprietary stuff, and using a web script to strip away any surrounding material and reproduce it in a different place while the file itself stays put.

My favorite approach, though, is simply to lie about your intentions. It might be as simple as creating a video high score board for an online game, where instead of their initials, people leave a ten-second taunt for the players they’ve just displaced. A database filled with video taunts has many potential uses. It might be more complicated, for instance creating an online application that uses face detection to perform some non-camera-related function—shaking your head to pan an image back and forth for instance—so that when the application requests access to the user’s web camera, he thinks nothing of pressing “OK,” never suspecting that his face is being displayed on a billboard somewhere across the globe with the supertitle “Did you know that 1 in 3 people has genital herpes?”

Or, as I discovered in the process of writing this post, offer some sort of online video conversion. Video formats are confusing as hell. Put up an all-in-one converter, make it look slick, and simply “keep a backup copy” of people’s video when they upload it!

The Sound of White Space or Pregnant Pause Parturition

As anyone who’s tried to write fiction knows, the real hurdle is not deciding what to write, it’s deciding what not to write. The empty page, like the empty score or the empty canvas, is white not because there is nothing on it but because everything’s on it—possibility, like light, is additive. The act of putting a word on a page, a note in the air, a drop of paint on the canvas removes a bit of that possibility, revealing a glimpse of what it may actually become. I know I’ve read somewhere that a block of uncarved stone contains every sculpture and that it’s only by chipping and chiseling that an artist collapses the artistic wave function into a singular reality. The result is the interface between the remaining possibility (positive space) and the absence of what has been removed (negative space).

Sometimes, though, that interface is deceptive. Foams, for instance, have voluminous contours, but pressure or heat or vigorous movement reveals their deception, reducing them to a mere puddle. Impermeability between negative and positive spaces in a work of art may well correlate to its quality and perdurance; I put that out there for the turtlenecked Barthes- and Foucault-reading crowd to discuss. In any case, designers who find commercial success of the Architectural Digest sort are great fans of drawing rigid lines of Germanic severity to divide what’s there from what’s not.

While I find this hard-edged contrast alienating in architectural spaces (I much prefer the worn edges and threadbare plush of vernacular utilitarianism), I all but require it linguistically. There is no place for diaphanous prose on my bookshelf, nor will I fight you for tickets to any recent mainstream movie. Blurring the boundary between meaning and nonsense purposefully is either comedy or chicanery; accidentally, it’s the sign of mental rot.

I find political discourse in general and American political discourse in particular a perfect example of this foamy, insubstantial nonsense, a populist pastiche of pre-chewed jingoistic pablum that fills heads with bubbles that quickly deliquesce to nothing. I was curious, in exploring notions of positive and negative space, to discover how this discourse is actually constructed.

To this end, I took this year’s State of the Union address, all 69 minutes of it, and reduced it to its negative space, cutting out all of President Obama’s utterances. What remains is a strangely compelling silent dance between the beats. How we don’t speak is as idiosyncratic as how we do. In Obama’s case, the space around his words is punctuated with generous pauses and a constantly turning head (though one suspects this may have more to do with the dual teleprompters than with his oratorical style). His hands are animated while he speaks, but drop with a thud against the lectern as he pauses, fingers interlocked.
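
For anyone who wants to try the same subtraction on other footage, the bookkeeping can be sketched in a few lines of Python. This is only an illustration, not how the edit above was actually made. It assumes you already have rough start and end times (in seconds) for each utterance, and the function name and the sample timings are made up.

```python
def silent_spans(speech, total):
    """Return the gaps between (start, end) speech spans within [0, total] seconds."""
    gaps = []
    cursor = 0.0  # end of the speech heard so far
    for start, end in sorted(speech):
        if start > cursor:
            # Nothing was being said between cursor and start: keep that span.
            gaps.append((cursor, start))
        # Overlapping or nested utterances just push the cursor forward.
        cursor = max(cursor, end)
    if cursor < total:
        gaps.append((cursor, total))
    return gaps


# Made-up timings for a 30-second clip; the last two utterances overlap.
speech = [(2.0, 9.5), (11.0, 20.0), (19.0, 24.0)]
pauses = silent_spans(speech, 30.0)
silence_seconds = sum(end - start for start, end in pauses)
```

The resulting spans are the cut list: keep those, drop everything else, and what remains is the negative space.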

The rhythm of his silences follows the rhythm of his speech. Even without hearing a word, we can tell by watching the crowd how his rhetoric moves between introductory remarks, political self-congratulation, and exhortation, before ending on a note of overwrought patriotism. The accompanying silence in the room is mesmerizing, both for its depth and its duration. Here is a man who can hold an audience for over an hour, and, I’d argue, comes pretty close to holding an audience for half an hour without saying a damn thing.

How much of the negative space that we experience do we throw away as soon as we can see where the positive space begins, and what do we lose in the process? I think it depends on what it is we’re experiencing, but my guess is that, regardless, it’s more than we think.


This week, a film canister with a roll of paper was hidden somewhere in New York, its approximate location posted on a Google Map along with a video with extra clues. My clue (no. 5) is on the corner of Broadway and 8th and was filmed on a Sony Ericsson G705. Double click on the YouTube clips to see them in their own window.

[UPDATE 2014] It appears Google has changed how one embeds media into maps, so the videos are no longer showing up when you click on the place markers. Dan Liss owns the map, I believe, so it will remain blank until he updates it (indefinitely, I’m guessing).

At the risk of sounding arbitrary…

Made with the Casio Exilim still camera we have on the Floor (which Final Cut just doesn’t seem to like) (which I have since reformatted).


This week’s cinematic challenge was to create the visual equivalent of Hemingway’s terse but complete “For Sale: Baby shoes, never worn,” a microcinematic three-shot story. I composed and shot a story about putting an egg carton with only two eggs left in it in the fridge and opening it the next morning to find the two eggs snuggling cozily in adjacent spaces, surrounded by half a dozen quail eggs. Then I read Robert Bresson’s Notes sur le Cinématographe, and it made me think that maybe I shouldn’t force the egg to tell a story but rather capture the story inherent in the egg.

How long does a minute take?

The Lumiere brothers made evocative and expressive films without camera movement, zooms, sound, or more than about 60 seconds’ worth of film. To learn how to see again and tell stories again, maybe we should too, at least according to Andreas Haugstrup Pedersen and Brittany Shoot’s Lumiere Manifesto. I am of the mind that their collection of one-minute silent, unedited, and effects-less films requires no such justification, but I’ll answer their call to arms, even if it does smack a little of filming floating plastic bags à la American Beauty.

There are beginnings, middles, and ends throughout my morning routine. They are predictable and irreversible and progress along parallel trajectories. Watching them closely makes me hate mornings even more.