In this new ESP32 tutorial we look at the ESP32 module with camera, which lets us capture images or stream video.
This ESP32 + camera module is one of the most popular, and for good reason. With it, we can easily build projects such as video surveillance or remote control of a robot, and even add (modest) AI and visual recognition features.
The heart of this module is our well-known ESP32 SoC from Espressif. By now we have repeated its characteristics a thousand times, but in short it is a 32-bit processor with 2 cores running at up to 240 MHz, with WiFi and Bluetooth connectivity.
Regarding the camera, the module is compatible with the OV2640 (2 MP), OV3660 (3 MP), and OV5640 (5 MP) sensors from the manufacturer OmniVision (OVT).
If you look for information on the manufacturer’s website you won’t find it, because the sensors were discontinued in 2009 and 2011. The datasheet can still be found, though, and I recommend reading it: it is very interesting to see how much electronics and processing is packed into a camera that sells for 1-2€.
The available resolutions in the OV2640 model are QVGA (320 x 240), CIF (352 x 288), VGA (640 x 480), SVGA (800 x 600), XGA (1024 x 768), SXGA (1280 x 1024), and UXGA (1600 x 1200). OV3660 adds QXGA (2048 x 1536) and OV5640 QSXGA (2592 x 1944).
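To put those resolutions in perspective, the following sketch computes the pixel count of each mode and the size of one uncompressed frame. The resolution list is taken from the paragraph above; the figure of 2 bytes per pixel (RGB565) and the helper names are assumptions used only for illustration.

```python
# Resolutions listed above, as (width, height) in pixels.
RESOLUTIONS = {
    "QVGA": (320, 240),
    "CIF": (352, 288),
    "VGA": (640, 480),
    "SVGA": (800, 600),
    "XGA": (1024, 768),
    "SXGA": (1280, 1024),
    "UXGA": (1600, 1200),   # largest OV2640 mode
    "QXGA": (2048, 1536),   # added by the OV3660
    "QSXGA": (2592, 1944),  # added by the OV5640
}

BYTES_PER_PIXEL = 2  # assumption: uncompressed RGB565


def megapixels(name):
    """Pixel count of a named resolution, in millions of pixels."""
    width, height = RESOLUTIONS[name]
    return width * height / 1_000_000


def raw_frame_bytes(name, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed frame at a named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height * bytes_per_pixel


for name in RESOLUTIONS:
    size_kib = raw_frame_bytes(name) / 1024
    print(f"{name:6s} {megapixels(name):5.2f} MP  {size_kib:8.0f} KiB raw")
```

UXGA works out to 1.92 MP, which is where the 2 MP figure of the OV2640 comes from. Under the 2 bytes per pixel assumption a single raw UXGA frame takes several megabytes, which helps explain why frames are usually handled as compressed JPEG.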
As for the fps we can achieve, it logically depends on the camera we are using and the chosen resolution. Very roughly, below 640 x 480 we get real time (25-30 fps). At 800 x 600 we notice a small drop, and above that the frame rate falls off quickly. At 1600 x 1200 we can expect around 3-5 fps.
As mentioned earlier, the ESP32 Camera includes some visual recognition and AI functions, such as face recognition. These functions are only available at low resolutions (< CIF).
This seems like a big limitation, but in fact most similar applications start by downscaling the image; that is, they work at low resolutions anyway.
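As an illustration of that downscaling step, here is a minimal sketch that shrinks a grayscale image by averaging blocks of pixels. It is not the code used by the ESP32 example; the function name downscale_gray and the list-of-lists image format are assumptions chosen to keep the sketch free of dependencies.

```python
def downscale_gray(image, factor):
    """Shrink a grayscale image (list of rows of 0-255 ints) by an integer
    factor, replacing each factor x factor block with its average value.
    Rows and columns that do not fill a whole block are dropped."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    out_h = len(image) // factor
    out_w = len(image[0]) // factor if image else 0
    result = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            total = 0
            # Sum every pixel inside the current block.
            for y in range(by * factor, (by + 1) * factor):
                for x in range(bx * factor, (bx + 1) * factor):
                    total += image[y][x]
            row.append(total // (factor * factor))
        result.append(row)
    return result


# A 4 x 4 test image reduced to 2 x 2.
frame = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [50, 50, 200, 200],
    [50, 50, 200, 200],
]
print(downscale_gray(frame, 2))
```

Reducing a 640 x 480 frame by a factor of 2 in this way yields 320 x 240, a quarter of the pixels to process.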
As for the price, there are several modules that combine ESP32 + camera. The best known is from the manufacturer AI-Thinker, with an OV2640 camera, which we can find for about 4.5€.
This AI-Thinker board has the disadvantage of not including a USB to UART converter. However, many sellers offer a base with a converter for about 0.80€, which makes programming and handling the module much easier. I recommend buying a base, which you can then share between several modules.
If you don’t want to buy the base, that’s fine. You can use an external USB to UART converter connected to the U0T and U0R pins. But in the end you will pay more for less, and you will have an unnecessary mess of cables.
Testing the ESP32 Camera
So, if we want to start testing the ESP32 + camera module, where do we start? With video streaming over WiFi? That sounds good, but it also sounds very complicated.
Well, no, because fortunately the ESP32 library for the Arduino IDE includes a very complete example that uses the ESP32 as a camera. In fact, it is not only very complete, it is one of the best ESP32 examples there is.
To make it work, we simply choose our hardware.
We load the example ESP32\Camera\CameraWebServer.
We modify the WiFi connection data for our network, load the program into the module, and connect with a web browser to the IP address assigned by the router, which the module shows us through the serial port, and … hey! There’s my hand.
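Once the example is running, the camera can also be read from a script instead of a browser. The sketch below assumes that the example serves a single JPEG frame at the /capture path (worth checking against app_httpd.cpp in your copy of the example). The IP address 192.168.1.50 and the helper names are hypothetical, so replace the address with the one printed on your serial port.

```python
import urllib.request

CAMERA_IP = "192.168.1.50"  # hypothetical: use the IP printed on the serial port


def capture_url(ip):
    # Assumption: the example serves one JPEG frame at /capture.
    return f"http://{ip}/capture"


def looks_like_jpeg(data):
    # JPEG data starts with the SOI (start of image) marker FF D8.
    return len(data) >= 2 and data[:2] == b"\xff\xd8"


def save_snapshot(ip, path="snapshot.jpg", timeout=5.0):
    """Download one frame from the camera, write it to disk,
    and return the number of bytes received."""
    with urllib.request.urlopen(capture_url(ip), timeout=timeout) as response:
        data = response.read()
    if not looks_like_jpeg(data):
        raise ValueError("response does not look like a JPEG image")
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```

With the module on the same network, calling save_snapshot(CAMERA_IP) should leave a snapshot.jpg file in the current directory.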
The resolution of the captured images and the obtained fps will also be shown through the serial port. On the left of the example web page, there is a configuration bar that allows modifying all the options (and there are many!) of the camera connected to the ESP32.
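Those same options can be changed without the web page. The sketch below assumes that the page sends each change as a GET request to /control with var and val query parameters, which is how the bundled example appears to work. Treat that path, the vflip setting name, and the IP address as assumptions to check against your copy of the example.

```python
import urllib.parse
import urllib.request


def control_url(ip, var, val):
    # Assumption: settings are changed with GET /control?var=<name>&val=<value>.
    query = urllib.parse.urlencode({"var": var, "val": val})
    return f"http://{ip}/control?{query}"


def set_option(ip, var, val, timeout=5.0):
    """Send one setting change to the camera and return the HTTP status code."""
    with urllib.request.urlopen(control_url(ip, var, val), timeout=timeout) as response:
        return response.status


# Hypothetical address and setting name, shown only to illustrate the URL shape.
print(control_url("192.168.1.50", "vflip", 1))
```

Building the URL is kept separate from sending it so the request can be inspected, or pasted into a browser, before it touches the camera.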
As I said, the example is very complete, and I recommend reading the code, because you can learn a lot about both the ESP32 + camera module and how to handle communication in general.
And that’s it for this post. I have left on GitHub the code of the example (which is the same as in the official ESP32 repo, nothing new here) and the served web page, extracted from the binary code of the example. I recommend you take a look, and see you in the next post.