The "numbers station" of YouTube?


#1

The YouTube channel for the user “Webdriver Torso” contains over 77,000 videos, each 11 seconds long and made up of a series of one-second pitches, each accompanied by a frame containing nothing but one blue and one red rectangle on a white background. No one seems to have any idea where this channel came from, who the user is, or what the purpose of the videos might be. “WebDriver” is the name of a product in the Selenium suite of browser automation tools (used, for instance, to test the performance and stability of a web application), and it’s plausible that this is the very tool used to automate the uploading of the videos to YouTube.

This is begging for an analysis of the data represented in these videos. For anyone fascinated by numbers stations but frustrated that they missed the heyday of the Cold War, this might just be your chance! I’m a developer, but this falls well outside my areas of expertise…but I’d be happy to try to cooperate with anyone interested.


#2

cool. bumping thread for interest


#3

Maybe get this person to help?


#4

You know, more and more I’m seeing smart Aussies doing clever tech stuff. It’s amazing - feels like everywhere I look!


#5



#6

It’s probably some guy with 100 cats in a trailer trying to communicate with aliens.


#7

Did I just get nam-shubbed?


#8

The article was close. I bet it’s some kind of automated testing thing: some company (possibly even YouTube itself) makes a product that uploads video to YouTube, and they need an automated way to test it, so a short video that can be programmatically verified as accurately reproduced is created, uploaded, and then tested.


#9

But it still leaves me wondering why they need to do that 77,000 times in a year. That’s 200ish times a day, right? Seems like a lot of quality control.

That said, I want this to be fun and slightly nefarious and a solvable thing, so I’m all about disregarding the simplest answer.


#10

With that many videos it must be official, right? Otherwise the auto-bots would ping it as spam.


#11

Well, my theory was that it might run every time any developer runs a certain test suite, which could be several times a day per developer. With a large team, 200 times/day wouldn’t be too unreasonable.

I used the Google API to list the first 45,000 videos, and put the info in a spreadsheet. I expected to find fewer videos uploaded on Saturday and Sunday; but it turns out there’s no day of the week that has significantly fewer uploads.

I still think it’s for testing something rather than some attempt to communicate.

EDIT: no pattern in hour of day either.

EDIT2: Just realized “Torso” might be related to the Tor project?
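
For anyone who wants to repeat the day-of-week and hour-of-day tally, here is a minimal sketch. It assumes the upload times have already been exported as ISO 8601 UTC strings; the function name and the three sample timestamps are made up for illustration and are not real data from the channel.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def tally_uploads(timestamps):
    """Count uploads per weekday and per hour (UTC) from ISO 8601 strings."""
    by_weekday = Counter()
    by_hour = Counter()
    for ts in timestamps:
        # Assumed format: "YYYY-MM-DDTHH:MM:SSZ"
        dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
        by_weekday[DAYS[dt.weekday()]] += 1
        by_hour[dt.hour] += 1
    return by_weekday, by_hour

# Hypothetical sample rows, standing in for the spreadsheet export.
sample = [
    "2014-04-23T17:05:09Z",
    "2014-04-23T03:00:00Z",
    "2014-04-26T17:30:00Z",
]
by_weekday, by_hour = tally_uploads(sample)
for day in DAYS:
    print(day, by_weekday[day])
```

A team that only works weekdays should show a clear dip on Saturday and Sunday in `by_weekday`; a flat distribution points more towards a scheduled job.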


#12

<Step-up-Transformers joke>


#13

I suppose it’s some software testing, considering WebDriver is part of Selenium, which is an automated software testing tool: http://docs.seleniumhq.org/projects/webdriver/

“The biggest change in Selenium recently has been the inclusion of the WebDriver API. Driving a browser natively as a user would either locally or on a remote machine using the Selenium Server it marks a leap forward in terms of browser automation.”

“Selenium automates browsers. That’s it! What you do with that power is entirely up to you. Primarily, it is for automating web applications for testing purposes”


#14

this is in the third sentence of the article


#15

Anyone have an idea about how to extract and quantify the tones and/or rectangles? If we could have those reduced to meaningful data, then we could look for patterns within each dataset or relationships between them.
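
For the tones, one possible starting point (a rough sketch, not a tested pipeline): assuming the audio of a single one-second pitch has already been decoded into a list of mono samples, counting zero crossings gives an approximate frequency. The function name and the synthetic 440 Hz tone are made up for illustration, and this only works if each pitch is a single clean tone.

```python
import math

def estimate_pitch_hz(samples, sample_rate):
    """Estimate the frequency of one steady tone by counting zero crossings."""
    if not samples:
        return 0.0
    crossings = 0
    prev = 0.0
    for s in samples:
        if s == 0.0:
            continue  # ignore exact zeros, they carry no sign
        if prev != 0.0 and (s > 0) != (prev > 0):
            crossings += 1  # sign flipped since the last non-zero sample
        prev = s
    duration = len(samples) / sample_rate
    # A sine wave crosses zero twice per cycle.
    return crossings / (2.0 * duration)

# Hypothetical check: one second of a 440 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
print(estimate_pitch_hz(tone, SAMPLE_RATE))
```

The estimate is only as precise as the clip is long, so a one-second clip is good to about a hertz. For noisy or multi-tone audio an FFT would be the better tool.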

EDIT: wow, just realized this article got bumped to the front page. Neat! But now I feel obligated to work on this puzzle instead of hoping someone smarter than me does it first.


#16

The tones could be quantified by key or frequency, or, depending on how many different tones are used, they could correspond to an alphabet or hexadecimal values.
The rectangles hold a fair amount of data. There are two of them, so let’s look at possible differences:
1- Are they touching, yes/no
2- Which is bigger, red/blue
3- If they are touching, which is on top, red/blue
4- Which is longer, red/blue
5- Which is taller, red/blue
6- Which is placed higher, red/blue
7- Which is placed further left, red/blue
8- Is one inside the other, yes/no

I might be missing some things, but from the boxes alone we get eight yes/no answers per frame, i.e. one octet, so about 11 octets of binary data per 11-second video.
I have no idea what order these would be in.
Another thought: could each octet represent a portion of an IP address or an ASCII value?
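
To make the yes/no list concrete, here is a sketch of turning one frame into an octet. Everything in it is an assumption: the rectangle coordinates are taken as already extracted, "touching" is read as overlapping or sharing an edge, the bit order simply follows the numbering of the list, and which rectangle is drawn on top has to be supplied separately because it can't be worked out from coordinates alone.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    x1: int  # left
    y1: int  # top (y grows downward, as in image coordinates)
    x2: int  # right
    y2: int  # bottom

    @property
    def width(self):
        return self.x2 - self.x1

    @property
    def height(self):
        return self.y2 - self.y1

    @property
    def area(self):
        return self.width * self.height

def overlaps(a, b):
    return a.x1 <= b.x2 and b.x1 <= a.x2 and a.y1 <= b.y2 and b.y1 <= a.y2

def contains(outer, inner):
    return (outer.x1 <= inner.x1 and outer.y1 <= inner.y1
            and outer.x2 >= inner.x2 and outer.y2 >= inner.y2)

def rectangle_answers(red, blue, red_on_top=False):
    """Answer the eight questions; True means 'yes' or 'red'."""
    touching = overlaps(red, blue)
    return [
        touching,                                    # 1 touching?
        red.area > blue.area,                        # 2 bigger
        touching and red_on_top,                     # 3 on top
        red.width > blue.width,                      # 4 longer
        red.height > blue.height,                    # 5 taller
        red.y1 < blue.y1,                            # 6 placed higher
        red.x1 < blue.x1,                            # 7 further left
        contains(red, blue) or contains(blue, red),  # 8 one inside the other
    ]

def pack_octet(answers):
    """Pack eight booleans into one byte, first answer as the top bit."""
    value = 0
    for answer in answers:
        value = (value << 1) | int(answer)
    return value

# Hypothetical rectangles, not measured from a real video.
red = Rect(10, 10, 60, 40)
blue = Rect(50, 30, 200, 120)
print(format(pack_octet(rectangle_answers(red, blue, red_on_top=True)), "08b"))
```

Note that the answers are not independent (question 3 only means something when question 1 is yes, for example), so a frame carries somewhat less than a full eight bits this way.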
Yet another thought, what if this is just Valve preparing to announce Portal 3? Remember what they did before Portal 2?


#17

then it would be cyan and orange ovals

I guess there are two sets of coordinates for each rectangle (X1,Y1,X2,Y2) and the pitch of the note, so about 10 bytes per frame, since the video is only about 240p and the pitch will only be somewhere between roughly 20 Hz and 16 kHz. The video is most likely just a random number generator fed into a routine that draws two rectangles and plays a note.
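
A back-of-the-envelope version of that per-frame estimate, as a sketch only. The frame size (426x240 for "240p"), the pitch range and the 1 Hz pitch resolution are all assumptions, and the answer moves a lot if you change them.

```python
def bits_needed(n_values):
    """Smallest number of bits that can distinguish n_values different values."""
    return max(1, (n_values - 1).bit_length())

def raw_bits_per_frame(width=426, height=240,
                       pitch_lo_hz=20, pitch_hi_hz=16000):
    # Each rectangle is two corners: (X1, Y1) and (X2, Y2).
    per_rectangle = 2 * (bits_needed(width) + bits_needed(height))
    # Pitch resolved to 1 Hz steps across the assumed range.
    pitch = bits_needed(pitch_hi_hz - pitch_lo_hz + 1)
    return 2 * per_rectangle + pitch

bits = raw_bits_per_frame()
print(bits, "bits, about", bits / 8, "bytes per frame")
```

This is an upper bound on what a frame could carry, not a claim about what the videos actually encode.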

if the intention was to send a secret message, why not either a) increase the bandwidth by using QR codes, or b) increase the steganography and hide the fact that there is a message at all?

Who wants this kind of thing? A botnet? A developer? Maybe flag the videos as copyright infringement and see who protests the DMCA takedown notice.


#18

There’s nothing in your message and it’s freaking me out.


#19

Well, we’re just gonna have two topics then, since merge failed. @beschizza we must be getting the UI wrong with this if it’s messing you up.

How I would do it:

  1. Start here
  2. Click admin wrench at top right (oh you aren’t an admin here HAHAHA SUCKS TO BE YOU)
  3. Select posts
  4. Select all
  5. Move to existing topic
  6. (this might be the tricky bit?) need the topic name or ID or URL of the target topic. Search does work here but if you have two topics with the same name already it’s gonna be possibly confusing.
  7. Move 'em on, head 'em up, rawhide!

#20

Thanks! I’ll be sure to follow these instructions in future. The topic ID isn’t too hard to find from URLs.

Eventually, it would be neat to be able to specify in WordPress the ID of an existing Discourse thread, as an alternative to starting a new one.