What If is back! Well, that’s not quite true: the What If website never left, but its RSS feed broke for a while. And since it’s a site with about one post a month, without RSS I forget it exists. (If you don’t know what RSS is, read this.) The RSS feed is working again and the site is as delightful as always.
If you don’t know what What If is, it’s a site by the same guy who does xkcd, described as Serious Scientific Answers to Absurd Hypothetical Questions. The latest post, Earth-Moon Fire Pole, is a perfect example, with a question from a five-year-old (kid questions are great ones):
My son (5y) asked me today: If there were a kind of a fireman’s pole from the Moon down to the Earth, how long would it take to slide all the way from the Moon to the Earth?
In typical What If fashion, he first explains why such a pole is impossible:
But then proceeds to say:
But let’s ignore those problems! What if we had a magical pole that dangled from the Moon down to just above the Earth’s surface, expanding and contracting so it never quite touched the ground? How long would it take to slide down from the Moon?
and follows with a lengthy discussion of the physics. From the Moon you’d initially be sliding “up,” so you have to start by climbing, but lunar gravity is low, so that’s possible. And you have to go a really long way before Earth’s gravity takes over and you can finally start to slide down. And then you start sliding crazy fast, etc. etc. Crazy fun and scientific all at the same time.
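Out of curiosity, you can estimate where along the pole Earth’s gravity starts to win with a few lines of Python. This is just a back-of-the-envelope sketch using textbook values for the masses and the average Earth–Moon distance, ignoring orbital motion entirely:

```python
import math

# Rough estimate of where Earth's gravity overtakes the Moon's
# along a straight Earth-Moon line: the point where the two
# inverse-square pulls balance. (Ignores orbital/centrifugal effects.)
M_EARTH = 5.972e24   # kg
M_MOON = 7.342e22    # kg
D = 384_400          # average Earth-Moon distance, km

# Balance: M_EARTH / r**2 == M_MOON / (D - r)**2
# => r / (D - r) == sqrt(M_EARTH / M_MOON)
ratio = math.sqrt(M_EARTH / M_MOON)
r_from_earth = D * ratio / (1 + ratio)

print(f"Gravity balance point: ~{r_from_earth:,.0f} km from Earth")
print(f"So you'd climb ~{D - r_from_earth:,.0f} km 'uphill' from the Moon first")
```

The balance point comes out around 90% of the way to Earth, which is why the first leg of the trip is such a long climb.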
Now I have to go back and read all the posts I missed…
Having computers read things and sound like people has long been a challenge. But a Google team seems to have hit parity with human speech (paper):
The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize time-domain waveforms from those spectrograms.
Sure, perfectly clear to me. But they follow with:
Our model achieves a mean opinion score (MOS) of 4.53 comparable to a MOS of 4.58 for professionally recorded speech
Comparable to professionally recorded speech: that I understand. Tacotron 2 is a single neural network trained from data alone. That seems to be the direction AI is going, neural networks that can be trained end to end from data.
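The two-stage shape the abstract describes (characters → mel spectrogram → waveform) is easier to see in code. This is only a toy sketch of that data flow, with random matrices standing in for the trained networks; none of the names or dimensions come from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N_CHARS, EMBED_DIM, N_MELS = 40, 16, 80
HOP = 256  # hypothetical: audio samples generated per spectrogram frame

# Stage 1 stand-in: the "sequence-to-sequence feature prediction network"
# that maps character embeddings to a mel-scale spectrogram.
char_embedding = rng.normal(size=(N_CHARS, EMBED_DIM))
to_mel = rng.normal(size=(EMBED_DIM, N_MELS))

def text_to_mel(char_ids):
    # Real Tacotron 2 uses an attention-based encoder/decoder here;
    # a single matrix product just illustrates the shapes involved.
    return char_embedding[char_ids] @ to_mel  # (frames, N_MELS)

# Stage 2 stand-in: the "modified WaveNet ... acting as a vocoder"
# that turns the spectrogram into a time-domain waveform.
def vocoder(mel):
    # Upsample each frame to HOP audio samples (WaveNet does this
    # autoregressively, conditioned on the spectrogram frames).
    return np.repeat(mel.mean(axis=1), HOP)  # (frames * HOP,)

char_ids = rng.integers(0, N_CHARS, size=12)  # a 12-character "sentence"
mel = text_to_mel(char_ids)
audio = vocoder(mel)
print(mel.shape, audio.shape)  # (12, 80) (3072,)
```

The point of the quoted sentence is that both stages are learned from data; here they’re just random weights to show the plumbing.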
What starts out as a way to relax eventually turns into a growth hack, a way to improve efficiencies and, of course, to talk about it on social media. This mindset is pervasive. This is who we/they are. There isn’t an off switch, and despite their best efforts to relax, hardly anyone actually knows how. In many ways, that obsessiveness is what makes Silicon Valley people successful in their day jobs.
Do you really think we could have had cars-on-demand if someone wasn’t obsessed with hacking the “taxi industry” and “limousines” because they had to wait too long for a cab in Paris?