Vision system for Poppy

In the current Poppy head design, vision is provided by two PS3 Eye cameras. But this solution is not very convenient, because:

  • you have to hack the PS3 Eye by removing its plastic shell;
  • the USB cable is about 2 m long, so you have to cut it and solder a USB connector on by hand;
  • the driver is more or less complicated to find depending on your OS;
  • the lens is big and too visible on Poppy’s head, leading to a misinterpretation of the head design. People tend (logically) to associate the cameras with the robot’s eyes, whereas we want to display the eyes on the screen.

New vision system for Poppy I:

For the next Poppy version, I am thinking of replacing the PS3 Eye with the Kickstarter project Pixy:

This camera is really interesting:

  • Some vision features are extracted directly on the board, so Pixy streams only useful information, such as the position of tracked faces, color blobs or specific objects, which can then be processed by a simple Arduino board (see the sketch just below this list).
  • Yet, if you need it, you can still get the actual video stream through the USB port.
  • It is open source (hardware and software).
  • The lens is pretty big, but it is an M12 lens, so you can easily change it, for example for a pinhole lens.
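To give an idea of what that on-board processing corresponds to, here is a rough OpenCV equivalent in Python. This is only an illustration for prototyping on a regular computer, not Pixy’s actual firmware, and the HSV colour range is an arbitrary example:

```python
import cv2
import numpy as np

# Illustration only: the kind of colour-blob tracking Pixy does on-board,
# reproduced on a PC with OpenCV. The HSV range below is an arbitrary example.
cap = cv2.VideoCapture(0)                      # any USB webcam
lower = np.array([100, 150, 50])               # example range: a blue object
upper = np.array([130, 255, 255])

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)
    # [-2] keeps the contour list across OpenCV 3.x and 4.x return conventions
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    if contours:
        # keep the biggest blob and report its bounding box, which is roughly
        # the information Pixy streams to the host controller
        x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
        print("blob at", x, y, "size", w, h)
    cv2.imshow("mask", mask)
    if cv2.waitKey(1) == 27:                   # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```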

I have already ordered two of them and we should receive them in the coming weeks.

Do you have any comments on this solution, or an interesting alternative?

Hi! I recommend Pixy! I have already ordered one from Kickstarter (waiting for it to ship to Turkey). However, face recognition is currently not supported by Pixy. Still, it has great capabilities and it is very easy to integrate with any controller!
As another option, I recommend an OpenCV-based system, which could lead more people to join the project.

You are right, face detection is not yet implemented, but they say it should arrive before the end of the year:

Can Pixy do face tracking, face detection, or facial recognition?

Facial recognition is a more difficult problem than face detection, so let’s just focus on face detection (for now). Pixy will not ship with face detection/tracking functionality, but since we’ve had such a huge response for this functionality, we’re committed to bringing it to you as soon as possible. We’re looking at a few possible algorithms. We have experience with Viola-Jones and have seen good results (http://www.cmucam.org/projects/cmucam3/wiki/Viola-jones)
But-- yes, expect this to come out before the end of the year!
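In the meantime, the same Viola-Jones approach they mention is already available in OpenCV, so face detection can be prototyped on a host computer right away. A minimal sketch, assuming a recent OpenCV Python package (which ships the standard Haar cascade files):

```python
import cv2

# Minimal Viola-Jones face detection using OpenCV's bundled Haar cascade.
# The cascade path assumes a recent OpenCV Python package; adjust if needed.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                  # any USB webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) == 27:               # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```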

As for other options, it would for sure be OpenCV-based, but the problem is more on the hardware side. So far, I have not found another suitable camera…

What about making a clone of Pixy? :slight_smile: Pixy is actually a computer with a good integrated camera.

Here is my suggestion:
We will use a pcDuino (http://www.pcduino.com) and an HD camera connected via USB.
pcDuino gives easy access to I/O pins, including I2C (SDA/SCL), UART, PWM, DAC and digital pins!
Python or C++ can be used with OpenCV to create software that is fully customized for Poppy’s vision.

Later on, voice recognition, speech synthesis, IMU integration or sound direction detection could be implemented on the same hardware, turning it into a full head module!
This way, it is easier to customize and upgrade the vision system. However, I know this approach is harder, longer and more expensive than simply using Pixy! :blush:
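The main open question with this approach is probably whether the pcDuino can keep up with a USB HD camera, so a small throughput check would be a sensible first test. A rough sketch, assuming OpenCV 3+ is installed on the board and the camera appears as device 0:

```python
import time
import cv2

# Rough throughput check: how many frames per second can the board grab
# and convert? Camera index 0 (/dev/video0) and a 720p request are assumed.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

count, start = 0, time.time()
for _ in range(100):
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # one minimal processing step
    count += 1

elapsed = time.time() - start
if count:
    print("captured %dx%d at %.1f fps"
          % (frame.shape[1], frame.shape[0], count / elapsed))
cap.release()
```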

We have just received our Pixy!

The current documentation on the Pixy website only offers a 2D drawing of the Pixy board. So the first thing I did was to model it in 3D using the actual Pixy and a caliper. I tried to make it as precise as possible to allow a perfect integration into the Poppy head.

As this work can be useful to others, I put it on GrabCAD under an open source CC-BY-SA licence.

https://grabcad.com/library/pixy-camera-cmucam5-1

pixy-camera-cmucam5.zip (11.2 MB)

It should also be available on the Pixy website soon.

Hi! I got mine yesterday! :slight_smile: Good job with the 3D model, and thanks for sharing.
Now, let’s see Pixy on Poppy… :wink:

You can try http://muq.org/~cynbe/vision-apps/ ; it works with most cameras. (I tested it with Cynbe years ago.)

So some news:

1- We will not supply the Pixy camera as the main Poppy camera because:

  • it is a bit too large
  • it cannot be connected via USB and SPI at the same time
  • the lens is too big and we have not found a good replacement yet

Yet we will keep compatibility with our electronic I/O board so that anyone can still use it if they want to.

2- Until now, we had planned to use the CAM8100 camera module:

  • Really small and easy to integrate
  • Small USB connector
  • Works directly on Linux

But we have just learned that its production will be discontinued and it will be replaced by the CAM8200, which will not be embeddable because of its much bigger size.

3- Now we have to find a new camera for Poppy v1…

This camera should have:

  • USB connection
  • a small PCB
  • a tiny lens, so that it is not visible on Poppy’s face (the screen provides the “expressive eyes”, not the camera)
  • easy availability (you can buy it over the internet)

After some deep browsing on Amazon, I have just found several new options!

Yet I have not investigated further, so I will just keep track of my discoveries here for reference:

  1. This company seems to have plenty of small cameras, especially this camera module.
  2. There are (really) plenty of HP laptop camera modules compatible with our design. We would just have to find the best reference in terms of quality and availability.
  3. The following nice and small USB camera boards: an HD fisheye and a 5 MP autofocus. But I don’t know how many of them are available…

Yes, for an eye, just hide it with sunglasses, either tinted or mirrored, to conceal the lens in its head.

At first I thought, yes, one eye is a good idea and simpler.
But since I have been playing with the iCub project for some time, I remembered that this robot has two “true” eyes, I mean with a truly different view from each eye.
So, why do we have two eyes rather than one?


I understand that this means a lot of work, but YARP already handles a lot of it.

There is also http://arxiv.org/abs/0908.3359

And this could really help to clearly explain the benefit of two eyes in robotics: http://aplab.bu.edu/assets/download/PDFs/articles/SantiniRucci07.pdf
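To make the benefit concrete: with two calibrated cameras, OpenCV can compute a disparity map, i.e. per-pixel depth that a single camera cannot give you. A minimal sketch, assuming OpenCV 3+ and an already rectified left/right image pair (the file names are placeholders):

```python
import cv2

# Block-matching stereo: from a rectified left/right pair, compute a
# disparity map (larger disparity = closer object).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right)

# Normalize for display: bright pixels are near, dark pixels are far.
disp_view = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
cv2.imwrite("disparity.png", disp_view)
```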

I think you’re overcomplicating it.

This is more like what we really need.

We’d use it differently, and not so much to animate Poppy. We go through the setup checklist, and then Poppy can recognize our facial expressions. That’s a form of machine vision we can use and modify to our liking. We could throw a Coke can in there, label it, and then Poppy could find it. It’s not a facial expression to recognize, but it is an object. We might use CAD to draw a 3D model of the object we want the robot to recognize, just to generate a proper point cloud.

It’s just that two cameras put together can resolve a more accurate point cloud and, as a result, give the robot more room to avoid obstacles and carry out object-oriented missions. Cleaning a toilet isn’t easy. So, think about all of the cleaners, tools, etc. that you need yourself to clean it, and then go step by step with just the objects and targets required.

Bottom line: in electronics, say we have a missing capacitor but we know what frequency the circuit is tuned to; there is a way of working the equations backwards to find that value. The same is true of all the math a robot’s visual system has to do, but in reverse. So, instead of projecting two images on a screen, edge detection gives a fair alignment for starting a point cloud.

The point cloud is used just the way a 3D gaming engine is used: instead of vertices, all you have is dots in 3D space, and then the computer starts drawing the vertices. Units of measure are known within the 3D space. It is full of approximations, probably within an eighth of an inch if you look at the resolution of the 3D. It would probably only need to be as good as Quake, or Quake II.

But instead of running a game, the AI is plotting a path through obstacles. And instead of the robot making all kinds of mistakes, if it has been in the room and all around in it, then it already has a fairly complete 3D map, and the only things it will notice are changes since the last time it was there. It would not even ask any questions, just update the map. A gaming AI is enough for navigation once you have a 3D map of a room. Most of the software already exists; it’s just that the pieces we need aren’t all of the pieces that they need.
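As an illustration of that last point, once you have a point cloud of the room, turning it into a map that a gaming-style path planner can use is mostly bookkeeping. The sketch below is only a toy example with made-up parameters: it assumes points expressed in metres in the robot frame and treats anything between ankle and head height as an obstacle, flattening the cloud into a 2D occupancy grid:

```python
import numpy as np

def occupancy_grid(points, cell=0.05, size=10.0, z_min=0.05, z_max=1.0):
    """Flatten a 3D point cloud (N x 3, metres, robot at the centre)
    into a 2D occupancy grid for path planning.

    cell         -- grid resolution in metres
    size         -- side length of the mapped square, in metres
    z_min, z_max -- height band considered an obstacle
    """
    n = int(size / cell)
    grid = np.zeros((n, n), dtype=bool)

    # keep only points in the obstacle height band
    pts = points[(points[:, 2] > z_min) & (points[:, 2] < z_max)]

    # convert x, y coordinates to grid indices, robot at the centre cell
    ij = np.floor(pts[:, :2] / cell).astype(int) + n // 2
    ij = ij[(ij >= 0).all(axis=1) & (ij < n).all(axis=1)]
    grid[ij[:, 0], ij[:, 1]] = True
    return grid

# Example: two fake obstacle points one metre in front of the robot;
# the third point is above head height and therefore ignored.
cloud = np.array([[1.0, 0.0, 0.5], [1.0, 0.1, 0.5], [2.0, 0.0, 1.5]])
print(occupancy_grid(cloud).sum(), "occupied cells")
```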

For now, the best example is ASIMO’s greeting/dismissal, where it just bows politely. It could stand too close, or too far away, and even though we feel it is just being polite, we are probably lucky not to bump heads.

But one of the first bugs this software would have presented is actually a solution. The problem it would have had in its very beginnings as a program is really a solution we need: when it first 3D-mapped properly, it would have mapped the whole image. There is the room map we need for navigation.

Do you have an idea of how to do it AND make it easily reproducible by anyone?

What is wrong with what I have said that you all respond with things like this?