Monthly Archives: May 2016
The realistic depiction of light transport in a room is important in the production of computer-generated movies. If it fails, the three-dimensional impression is quickly lost. For this reason, the movie industry’s digital lighting experts use special computing methods that require enormous computational power and therefore raise production costs.
Not only the film industry but also the automobile industry invests in making the lighting conditions of computer-generated images as realistic as possible. Even during the development process, entire computing centers are used to compute and display realistic pictures of complex car models in real time. Only in this way can designers and engineers evaluate the design and the product features at an early stage and optimize them during the planning phase. “They build hardly any real prototypes. Hence, the designers want to make sure that the car body on the screen looks exactly as the real vehicle will appear later,” explains Philipp Slusallek, professor of computer graphics at Saarland University, Scientific Director at the German Center for Artificial Intelligence (DFKI) and Director of Research at the Intel Visual Computing Institute at Saarland University.
With previous computing methods, it was not possible to compute all illumination effects efficiently. Monte Carlo path tracing handles direct light incidence on surfaces very well, along with indirect illumination from light reflected off surfaces in a room. It does not work well, however, for illumination around transparent objects, such as semi-transparent shadows cast by glass objects, or for illumination by specular surfaces (so-called caustics). These effects were the strength of photon mapping, which in turn gave disappointing results for the direct lighting of surfaces. Because the two approaches were mathematically incompatible (Monte Carlo integration versus density estimation), they could not be merged and had to be computed separately for each image. This raised the computation costs for computer-animated movies like “The Hobbit: An Unexpected Journey”, where up to 48 pictures per second have to be computed for a movie whose “normal” version is 169 minutes long.
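To give a sense of scale, the frame count implied by those two figures can be worked out directly. This is only a rough count: it assumes every second of the 169-minute runtime is rendered at the full 48 pictures per second, and the variable names are illustrative.

```python
# Rough frame count for a 169-minute movie at 48 pictures per second.
# Assumption: the whole runtime is rendered at the full frame rate.
FRAMES_PER_SECOND = 48
RUNTIME_MINUTES = 169

total_frames = FRAMES_PER_SECOND * RUNTIME_MINUTES * 60
print(total_frames)  # 486720
```

Every one of those roughly half a million pictures needs its lighting computed, which is why running two separate methods per image is costly.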
In cooperation with Ilyan Georgiev, PhD student at the Graduate School for Computer Science in Saarbrücken, Jaroslav Krivanek from the Charles University in Prague and Thomas Davidovic from the Intel Visual Computing Institute at Saarland University, Slusallek developed a mathematical approach in 2012 that cleverly combines the two methods. They reformulated photon mapping as a Monte Carlo process, which allowed them to integrate it directly into Monte Carlo path tracing. For every pixel of the image, the new algorithm decides automatically, via so-called multiple importance sampling, which of the two strategies is best suited to compute the illumination at that spot.
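The idea behind multiple importance sampling can be illustrated with a toy one-dimensional integral. The sketch below is not the VCM algorithm and not the researchers' code. It only shows how samples from two different strategies can be weighted and combined into one unbiased estimate, here using the common "balance heuristic" as an assumed weighting rule. The integrand, the two sampling densities and all function names are made up for illustration.

```python
import math
import random


def f(x):
    """Toy integrand on [0, 1]; its exact integral is 1."""
    return 3.0 * x * x


def pdf_uniform(x):
    """Strategy A: sample x uniformly on [0, 1]."""
    return 1.0


def pdf_linear(x):
    """Strategy B: sample x with density 2x on [0, 1]."""
    return 2.0 * x


def balance_weight(pdf_used, x):
    """Balance heuristic: weight a sample by its strategy's share of the total density."""
    return pdf_used(x) / (pdf_uniform(x) + pdf_linear(x))


def mis_estimate(num_pairs, rng):
    """Average over num_pairs draws, taking one sample from each strategy per draw."""
    total = 0.0
    for _ in range(num_pairs):
        x_a = rng.random()
        # Inverse-CDF sampling for density 2x; 1 - random() lies in (0, 1],
        # so x_b is never exactly zero and pdf_linear(x_b) stays positive.
        x_b = math.sqrt(1.0 - rng.random())
        total += balance_weight(pdf_uniform, x_a) * f(x_a) / pdf_uniform(x_a)
        total += balance_weight(pdf_linear, x_b) * f(x_b) / pdf_linear(x_b)
    return total / num_pairs


if __name__ == "__main__":
    # Should print a value close to the exact integral, 1.
    print(mis_estimate(20_000, random.Random(1)))
```

Wherever one strategy has a high sampling density, its samples receive most of the weight. That is the sense in which the combination picks the better-suited strategy at each spot.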
The researchers from Saarbrücken also supplied mathematical proof that the results of the new computing method are consistent with those of the two former methods. “Our new method vastly simplifies and speeds up the whole calculating process,” says Slusallek.
The method, “Vertex Connection and Merging,” abbreviated as VCM, was not only accepted in 2012 at SIGGRAPH, one of the most important conferences in computer graphics research, but was also very well received by industry. “We know of four different companies that partially integrated VCM in their commercial products only a few months after the scientific publication. The most recent example is the new version of the software RenderMan developed by the company Pixar. For decades this has been the most important tool in the movie industry. We are very proud of this achievement,” Slusallek says. The Californian (US) company Pixar, famous for movies like “Toy Story,” “Up,” “Finding Nemo,” and “Monsters, Inc.,” is part of the Walt Disney Company. Pixar originally got its name from Apple founder Steve Jobs. Up to now, Pixar has received twelve Oscars for its movies.
The technique used to fabricate the microfish provides numerous improvements over other methods traditionally employed to create microrobots with various locomotion mechanisms, such as microjet engines, microdrillers and microrockets. Most of these microrobots are incapable of performing more sophisticated tasks because they feature simple designs—such as spherical or cylindrical structures—and are made of homogeneous inorganic materials. In this new study, researchers demonstrated a simple way to create more complex microrobots.
The research, led by Professors Shaochen Chen and Joseph Wang of the NanoEngineering Department at UC San Diego, was published in the Aug. 12 issue of the journal Advanced Materials.
By combining Chen’s 3D printing technology with Wang’s expertise in microrobots, the team was able to custom-build microfish that can do more than simply swim around when placed in a solution containing hydrogen peroxide. Nanoengineers were able to easily add functional nanoparticles into certain parts of the microfish bodies. They installed platinum nanoparticles in the tails, which react with hydrogen peroxide to propel the microfish forward, and magnetic iron oxide nanoparticles in the heads, which allowed them to be steered with magnets.
“We have developed an entirely new method to engineer nature-inspired microscopic swimmers that have complex geometric structures and are smaller than the width of a human hair. With this method, we can easily integrate different functions inside these tiny robotic swimmers for a broad spectrum of applications,” said the co-first author Wei Zhu, a nanoengineering Ph.D. student in Chen’s research group at the Jacobs School of Engineering at UC San Diego.
Google’s DeepMind brought us artificial intelligence systems that can play Atari classics and the complex game of Go as well as — no, better than — humans. Now, the artificial intelligence research firm is at it again. This time, its machines are getting really good at sounding like humans.
DeepMind unveiled WaveNet, an artificial intelligence system that the company says outperforms existing text-to-speech technologies by 50 percent. WaveNet takes an entirely different approach from those technologies: it learns from raw audio files and then produces digital sound waves that resemble those produced by the human voice. The result is more natural, smoother sounding speech, but that’s not all. Because WaveNet works with raw audio waveforms, it can model any voice, in any language. WaveNet can even model music.
Someday, man and machine will routinely strike up conversations with each other. We’re not there yet, but natural language processing is a scorching hot area of AI research — Amazon, Apple, Google and Microsoft are all in pursuit of savvy digital assistants that can verbally help us interact with our devices. Right now, computers are pretty good listeners, because deep learning algorithms have taken speech recognition to a new level. But computers still aren’t very good speakers. Most text-to-speech systems are still based on concatenative TTS — basically, cobbling words together from a massive database of sound fragments. Other systems form a voice electronically, based on rules about how letter combinations are pronounced. Both approaches yield rather robot-y sounding voices. WaveNet is different.
Flexing Those Computing Muscles
WaveNet is an artificial neural network that, at least on paper, resembles the architecture of the human brain. Data inputs flow through layers of interconnected nodes (the “neurons”) to produce an output. This allows computers to process mountains of data and recognize patterns that would perhaps take humans a lifetime to uncover. To model speech, WaveNet was fed real waveforms of English and Mandarin speech. These waveforms are loaded with data points, roughly 16,000 samples per second, and WaveNet digests them all.
To then generate speech, it assembles an audio wave sample by sample, using statistics to predict which sample to use next. It’s like assembling words a fraction of a millisecond of sound at a time. DeepMind researchers then refine these results by adding linguistic rules and suggestions to the model. Without these rules, WaveNet produces dialogue that sounds like it’s lifted from The Sims.
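The sample-by-sample loop can be sketched in a few lines. This is only a toy, not WaveNet: the trained neural network is replaced by a hand-written stand-in that favors amplitude levels close to the previous sample. The number of amplitude levels and all function names are assumptions made for illustration. Only the rate of 16,000 samples per second comes from the description above.

```python
import random

SAMPLE_RATE = 16_000  # samples per second, the figure quoted above
LEVELS = 16           # hypothetical number of quantized amplitude levels


def toy_next_sample_distribution(history):
    """Hypothetical stand-in for the trained network.

    Returns a probability for each amplitude level, favoring levels
    close to the most recent sample.
    """
    previous = history[-1] if history else LEVELS // 2
    weights = [1.0 / (1 + abs(level - previous)) ** 2 for level in range(LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_samples, rng):
    """Build a waveform one sample at a time, each conditioned on what came before."""
    samples = []
    for _ in range(num_samples):
        probabilities = toy_next_sample_distribution(samples)
        next_sample = rng.choices(range(LEVELS), weights=probabilities, k=1)[0]
        samples.append(next_sample)
    return samples


if __name__ == "__main__":
    one_second = generate(SAMPLE_RATE, random.Random(0))
    print(len(one_second))  # 16000
```

One second of audio takes 16,000 passes through the loop, and each pass must wait for the previous sample. With a real network in place of the toy function, that goes some way toward explaining why the technique demands so much computing power.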
The technique requires a ton of computing power, but the results are pretty good — WaveNet even generates non-speech sounds like breaths and mouth movements. In blind tests, human English and Mandarin speakers said WaveNet sounded more natural than any of Google’s existing text-to-speech programs. However, it still trailed behind actual human speech. The DeepMind team published a paper detailing their results. Because this technique is so computationally expensive, we probably won’t see this in devices immediately, according to Bloomberg’s Jeremy Kahn. Still, the future of man-machine conversation sounds pretty good.