Artificial intelligence creates sound effects for silent videos that fool humans

1 Like

This technology could save me a ton of foley work on my upcoming film “Drumsticks tapping on things”.

9 Likes

Fix it in editing. That’s what we did for “Drumsticks: Offscreen!”

1 Like

I wonder to what extent the neural network is associating sounds with the material itself (its texture and movement), and how much it’s purely learning about the motion of the drumstick. The latter would be less impressive, though still cool; if you think about it, with a sufficiently high frame rate you could simply measure the sound by seeing the vibrations through the stick, with no AI needed. Like that kind of remote audio bug that works by shining a laser onto a window or similar rigid surface.

Either way, there’s something neat about retrieving information that was never directly recorded. It’d be cool if it could get good enough to, say, extract on-set sounds from silent films.

3 Likes

My impression from watching was that it mainly had to do with stick motion, angle, etc. I’ve played the drums for 30+ years, and just about every clip of the “predictive” sounds had some in it that seemed entirely wrong to me. (Some laughably so.) I wonder if the test where they asked humans to decide which video had real audio and which had predicted audio is available somewhere.

Nowhere is Poetry so actual as in foley work. Metaphor and simile are reality:

This cinder block being dragged along another cinder block IS an ancient monolith’s hidden door opening unexpectedly.

These empty shoes crushing gravel are LIKE a person walking down a road.

When I was a kid playing around with tape recorders, I used an accordion to suggest an automobile accident. It worked.

5 Likes

About 9 seconds in, I expected to hear Pink Floyd.

So we’ll soon have a simulation of a watermelon being axed simulating a human getting axed. Progress!

1 Like

Let’s see it try to keep up with this rocker.

6 Likes

If I understand correctly, the replacement sounds are not “created” or synthesized, just picked from a database of actual sounds of drumsticks hitting things. While still impressive, I find it somewhat less so than the headline implied.
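
For anyone curious what “picked from a database” could look like in practice, here is a minimal sketch of nearest-neighbor retrieval. It assumes the network outputs a short feature vector for each hit and that every recorded clip has a stored vector of the same kind. The database contents, file names, and `nearest_clip` function are all made up for illustration; this is not the researchers’ code.

```python
import math

# Hypothetical database: (feature vector, recorded clip name).
# The vectors and file names are invented for illustration only.
SOUND_DB = [
    ((0.9, 0.1, 0.2), "stick_on_metal.wav"),
    ((0.2, 0.8, 0.1), "stick_on_leaves.wav"),
    ((0.1, 0.2, 0.9), "stick_on_water.wav"),
]


def nearest_clip(predicted, database=SOUND_DB):
    """Return the clip whose stored features are closest to `predicted`.

    Closeness is plain Euclidean distance, which is an assumption;
    a real system might compare audio features quite differently.
    """
    _, best_clip = min(
        database, key=lambda entry: math.dist(entry[0], predicted)
    )
    return best_clip


# A predicted vector that sits near the "metal" entry.
print(nearest_clip((0.8, 0.2, 0.1)))  # stick_on_metal.wav
```

Nothing is synthesized here: the output is always one of the recordings already in the database, which is the distinction the headline glosses over.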