STORY

Employee Development

Realizing the world's first image sensor with a built-in AI engine! Discover the stories behind the drive to create a world first

October 11, 2022

With AI technology now firmly established, AI is expanding into many aspects of our lives. Back in 2016, however, many companies, ventures among them, had long wanted to build image recognition applications with AI but faced a formidable challenge in commercializing the idea: where to run the AI processing. Many considered running it on a cloud server, but this proved unrealistic because every image and video would have to be uploaded, the data volume would be considerable, and the process would consume a lot of power.
Ryoji Eki of Sony Semiconductor Solutions Corporation (SSS) had another idea: to build an AI engine into an image sensor. He presented a paper on an AI engine with the highest processing efficiency (small, fast and energy-efficient) at the 2021 ISSCC (International Solid-State Circuits Conference)*1 to showcase the IMX500, the world’s first intelligent vision sensor with built-in AI processing functionality. Eki shares the story of the development behind the successful commercialization and the secrets of creating something the world has never seen.

*1) An international conference for the semiconductor industry to showcase the latest technology

Eki Ryoji

Sony Semiconductor Solutions Corporation
System Solution Business Division

Eki completed his studies at the Graduate School of Hiroshima University in 2008 and joined Sony Corporation’s Semiconductor Business Group (now SSS).
He has worked on the development of CMOS image sensors, notably the commercialization of back-illuminated and stacked CMOS image sensors for mobile applications, followed by the product planning, commercialization and business exploration of the intelligent vision sensor, which started in 2016. His research paper was accepted by the international conference ISSCC 2021, and he received the Sony Outstanding Engineer Award 2020. Currently, he is developing AI models and an AI learning environment for SSS’s edge AI platform AITRIOS™ and planning the next-generation intelligent vision sensor.

Inspired by the great potential of combining AI with image recognition at a workshop in the USA

The intelligent vision sensor is a stacked CMOS image sensor with an AI engine integrated into its logic circuit. It can run AI processing on captured images on the spot and upload only the necessary data to the cloud.
Why do we need a sensor that lets only selected data be stored? Against the backdrop of digital transformation (DX), the emphasis is on using digital technology to make analog, manual work more efficient. For image sensors, the crucial point is finding a way to process captured images into data that can be used efficiently. If the sensor transmits unprocessed images, someone will subsequently need to check them, creating an analog job. AI, by contrast, can scan the images and recognize people, empty spaces, etc., and generate the required information, such as how many people have passed by or how many spaces are available. The receiver can put data processed this way to use immediately. Sending only the required data also means smaller data size, lower power consumption and faster transmission, and it addresses privacy concerns because unneeded images are not stored. In Rome, Italy, a pioneering trial project*2 is underway that applies the technology to parking lot monitoring, recognizing empty parking spaces and reporting availability. Another application under consideration is to monitor a room for occupants and automatically turn off the air conditioning when no one is there.

*2) Related link: Video published of a smart city trial project in Rome, Italy, using the intelligent vision sensor IMX500

Eki embarked on the development of the intelligent vision sensor in the summer of 2016. Around that time there was excitement that image recognition AI was surpassing the capability of the human eye, and his inspiration came from a three-month workshop at the US sales companies in which he participated from spring 2016. He took up this opportunity to listen to customer feedback and develop products that better reflected their requests, which he saw as the next step in his career in image sensor development. Many companies in the USA, from small enterprises to major corporations, were organizing conventions and exhibitions, and a number of them featured image processing technology with AI capabilities. Having seen these exhibitions and talked at length with SSS’s American clients, Eki began to see potential in combining image recognition and AI. However, the technologies shown at those exhibitions relied, without exception, on expensive GPUs to run AI processing, which would cost hundreds of thousands of yen per unit upon commercialization. What is more, their power consumption and the volume of data they had to handle were bottlenecks. With these challenges in mind, he kept listening to various people and eventually arrived at an idea: embedding AI in the sensor might make it possible to realize a smaller, lower-cost sensor, a product that sells.

With AI in the midst of its evolution, development proceeded with a vision of commercialization several years ahead

SSS’s image sensors have the advantage of detecting not only visible light but also invisible light to sense depth and capture movement. “This advantage may put us on a level playing field with the world’s leading tech companies.” Inspired by the potential of AI and image recognition, Eki set up a project on his own and formed a small team to start development. With the help of a project planning expert, he developed product ideas based on the client feedback gained in the USA. His baseline concept was that a good product must offer applications that leverage what AI excels at: recognizing people and objects. With this in mind, he embarked on product development with an unrelenting determination to achieve low power consumption, small size and fast processing.
He faced two major obstacles. One was the development of an AI engine. When he tried an existing model, it fell far short of the anticipated specifications, leaving him at an impasse. But this did not crush Eki’s determination. He looked everywhere, inside and outside the company, for information on AI engines. Then information started to flow in. It is part of SSS’s corporate culture that people are willing to lend a hand to someone embarking on a new endeavor. Seeing Eki asking around for advice on AI engines, colleagues were naturally inclined to suggest someone who might have ideas. Eventually, he was led to a team at Sony Semiconductor Israel (SSI). By pure coincidence, members of the SSI team were also thinking of starting an AI-related project, so the two parties found each other at the right time.

The other obstacle was understanding the value the final product would offer. Given the fast and ceaseless advancement of AI technology, it was extremely difficult to develop specific ideas about the product's structure, electrical signals and advantages while looking several years ahead to commercialization. He first formulated the specifications expected of the sensor based on the vision he had discussed with corporate contacts while in the USA. He then pursued product development while expanding the possible scenarios to include one in which AI progressed much faster than anticipated.

Completing the project with rock-solid determination despite a major setback

As a young student, Eki was into handball. He was a member of the prefectural handball team that played in the prefectural junior-high championship. In high school, his team reached the semi-finals. Being direct is part of his nature when it comes to doing what he wants to do. At university, he found a new passion, snowboarding, and spent some winters working part-time at a ski resort. The same directness applies when he needs to confront personal challenges. While staying in the USA for the workshop, he spent weekends out on a golf course to improve his command of English. He says, “The best way to learn is to mingle with the locals.” Turning up alone at the course, he would be paired with a stranger, with whom he would have to communicate in English. Learning by necessity was how he chose to acquire practical language skills.
With his forward-thinking attitude, he seems to take his career in his stride, but the development of the intelligent vision sensor did involve a major setback. The project’s first client pulled out at the last minute, just before mass production. The problem was that this client's requirements could have been met by means other than the intelligent vision sensor once technology was sufficiently advanced. Eki was aware of this and had been pushing the project forward with that concern in the back of his mind, and the worst-case scenario became reality. Since this was the project's only client, he was devastated, thinking that the development of the intelligent vision sensor was over. It was his boss who helped him rise again from the depths of despair. He said to Eki, “You started the project, so you must finish it.” He also offered a piece of advice: “You need to know why it didn’t sell. You must know the reasons for a success or failure and use that knowledge in your next project. Otherwise, this project will have served nothing but your ego.” These words gave Eki renewed courage to complete the project no matter what, and he eventually brought the intelligent vision sensor to realization. The experience taught him always to consider which technological aspects are crucial for clients, to make sure the product will be chosen.

The keywords are “design for everyone” and, when facing difficulties, “talk to someone”

From day one at Sony, Eki has wanted to be part of the effort to create the world’s best products and unparalleled technology. He was fortunate to take part in the development of the world's first back-illuminated CMOS image sensor for mobile applications as well as that of the stacked CMOS image sensor. Through these developments, he gained experience in the various processes up to mass production and knowledge of image sensors. This experience and expertise earned him clients’ respect during the US workshop, which let him learn their future visions for their businesses and led to the intelligent vision sensor project. His approach to life, tackling new challenges directly and placing value on communication, defines his personality. He keeps channels open to exchange information with colleagues from other divisions and group companies whom he met through past product development projects, including the intelligent vision sensor. It is very important to be aware of the latest ideas entertained by development teams in different domains, as this is valuable information for deciding the direction of future product development. He also relays client feedback to these development teams and, by doing so, explores possible solutions that could be realized with the technology under development.
While the intelligent vision sensor is a device for recognizing images, what clients really want is often a “state” that lies beyond it, such as “reducing employees’ workload.” To deliver the product as a service, this gap in understanding must be bridged. It is necessary to offer a solution that is easy to understand, like “the sensor can reduce the workload of your employees by counting the number of people and providing this information to them.” Eki says, “The most influential factor that makes people hesitate in pursuing DX is often not knowing how digital technology can be leveraged. To move forward with DX, we need to develop devices that can fill this void, for example, by making operation as easy as pushing a button.” He is focused on providing devices and services that make it easy to prepare the environment for DX and on offering solutions that only the intelligent vision sensor can deliver. His keyword is “design for everyone.”

We asked him where he finds value in his work. He replied, “Seeing people satisfied with the products I worked on” and “being able to say with confidence that I created these products.” These have not changed since he joined Sony, he says. His approach to work is founded on gratitude that he is doing things that interest him and a firm determination to see them through come what may, and he approaches his work wanting to “make contributions to society as the mark I leave on the world.” To his junior colleagues, Eki is a boss who delegates responsibilities and guides them so that they can pave their own way toward their own goals. He explains his principle: “When they are at an impasse, I give them some advice or, if it is outside my expertise, refer them to someone who may be able to help. My aim is for them to be creative while leveraging the knowledge they can gain from others throughout the Group network.” This is exactly what he experienced through the development of the intelligent vision sensor, seeking help from colleagues and understanding the reasons for failures in order to come up with solutions, and he is passing these lessons on to his team members. The team still has a long way to go, but among them, clearly, SSS’s culture of creating world firsts is alive and kicking.
