Warning

Kurento is a low-level platform to create WebRTC applications from scratch. You will be responsible for managing STUN/TURN servers, networking, scalability, etc. If you are new to WebRTC, we recommend using OpenVidu instead.

OpenVidu is an easier to use, higher-level, Open Source platform based on Kurento.

Kurento Modules

Kurento Media Server is controlled through the API it exposes, so application developers can use high-level languages to interact with it. The Kurento project already provides SDK implementations of this API for several platforms; see the Client API Reference.

If you prefer a programming language different from the supported ones, you can implement your own Kurento Client by using the Kurento Protocol, which is based on WebSocket and JSON-RPC.
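As a rough illustration of what such a custom client would send, the following sketch builds JSON-RPC 2.0 request messages in Python. The `create` method name and the `type` field reflect our reading of the Kurento Protocol and should be checked against its documentation; `make_request` is a hypothetical helper, and no WebSocket connection is opened here.

```python
import itertools
import json

# Monotonic request ids; JSON-RPC 2.0 uses the id to match responses to requests.
_ids = itertools.count(1)


def make_request(method, params):
    """Serialize a JSON-RPC 2.0 request (hypothetical helper, not a Kurento SDK call)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params,
    })


# Assumed message shape for asking the server to create a Media Pipeline.
create_pipeline = make_request("create", {"type": "MediaPipeline"})
print(create_pipeline)
```

A real client would send this string over the WebSocket and read back a response carrying the same `id`.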

In the following sections we will describe the Kurento API from a high-level point of view, showing the media capabilities exposed by Kurento Media Server to clients. If you want to see working demos using Kurento, please refer to the Tutorials section.

Media Elements and Media Pipelines

Kurento is based on two concepts that act as building blocks for application developers:

  • Media Elements: A Media Element is a functional unit performing a specific action on a media stream. Every capability is represented as a self-contained “black box” (the Media Element) to the application developer, who does not need to understand the low-level details of the element in order to use it. Media Elements can receive media from other elements (through their media sinks) and send media to other elements (through their media sources). Depending on their function, Media Elements can be split into different groups:

    • Input Endpoints: Media Elements capable of receiving media and injecting it into a pipeline. There are several types of input endpoints. File input endpoints take the media from a file, Network input endpoints take the media from the network, and Capture input endpoints are capable of capturing the media stream directly from a camera or other kind of hardware resource.

    • Filters: Media Elements in charge of transforming or analyzing media. Hence there are filters for performing operations such as mixing, muxing, analyzing, augmenting, etc.

    • Hubs: Media Objects in charge of managing multiple media flows in a pipeline. A Hub contains a different HubPort for each one of the Media Elements that are connected to it. Depending on the Hub type, there are different ways to control the media. For example, there is a Hub called Composite that merges all input video streams into a single output video stream, with all inputs arranged in a grid.

    • Output Endpoints: Media Elements capable of taking a media stream out of the Media Pipeline. Again, there are several types of output endpoints, specialized in files, network, screen, etc.

  • Media Pipeline: A Media Pipeline is a chain of Media Elements, where the output stream generated by a source element is fed into one or more sink elements. Hence, the pipeline represents a “pipe” capable of performing a sequence of operations over a stream.

    Media Pipeline example

    Example of a Media Pipeline implementing an interactive multimedia application: it receives media from a WebRtcEndpoint, overlays an image on the detected faces, and sends back the resulting stream.
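To make the source-to-sink idea concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Kurento client API: the `MediaElement` class, its `push` method, and the string "frames" are all assumptions made for illustration, and the single WebRtcEndpoint of the example is modelled as separate input and output elements for simplicity.

```python
class MediaElement:
    """Toy stand-in for a Media Element: transforms a frame and forwards it to its sinks."""

    def __init__(self, name, transform=lambda frame: frame):
        self.name = name
        self.transform = transform
        self.sinks = []     # downstream elements fed by this element
        self.emitted = []   # frames produced by this element, kept for inspection

    def connect(self, sink):
        """Feed this element's output into `sink`; returns `sink` to allow chaining."""
        self.sinks.append(sink)
        return sink

    def push(self, frame):
        out = self.transform(frame)
        self.emitted.append(out)
        for sink in self.sinks:
            sink.push(out)


# input -> overlay filter -> output, in the spirit of the example pipeline.
webrtc_in = MediaElement("input")
face_overlay = MediaElement("overlay", transform=lambda f: f + "+hat")
webrtc_out = MediaElement("output")
webrtc_in.connect(face_overlay).connect(webrtc_out)

webrtc_in.push("frame0")
print(webrtc_out.emitted)  # ['frame0+hat']
```

Because `sinks` is a list, one source can feed several sinks, which matches the "one or more sink elements" wording.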

The Kurento API is Object-Oriented. This means that it is based on Classes that can be instantiated in the form of Objects; these Objects provide properties that are a representation of the internal state of the Kurento server, and methods that expose the operations that can be performed by the server.

The following class diagram shows part of the main classes in the Kurento API:

digraph mediaobjects {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "rect";
    style = "filled";
  ]

  edge [
    arrowtail = "empty";
    dir = "back";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
  ]

  "MediaObject" [
    label = "{MediaObject|"
      + "+ getMediaPipeline() : MediaPipeline\l"
      + "+ getParent() : MediaObject[]\l}";
    labelurl = "MediaObject";
  ]

  "MediaElement" [
    label = "{MediaElement|"
      + "+ connect(...) : void\l"
      + "+ getMediaSinks(...) : MediaSink[]\l"
      + "+ getMediaSrcs(...) : MediaSource[]\l}";
    urllabel = "MediaElement";
  ]

  "MediaObject" -> "MediaPipeline";
  "MediaObject" -> "MediaElement";
  "MediaObject" -> "Hub";

  "MediaObject" -> "MediaObject" [label = "parent", constraint = false, dir = normal, arrowhead = "vee"];
  "MediaObject" -> "MediaPipeline" [label = "pipeline", constraint = false, dir = normal, arrowhead = "vee"];
  "MediaPipeline" -> "MediaElement" [headlabel = "*" label = "elements", constraint = false, dir = normal, arrowhead = "vee"];

  "MediaElement" -> "Endpoint";
  "MediaElement" -> "Filter";
  "MediaElement" -> "HubPort";

  "Hub" -> "HubPort" [headlabel = "*", constraint = false, dir = normal, arrowhead = "vee"];
}

Class diagram of main classes in Kurento API
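The inheritance and association relations in this diagram can be mirrored in a few lines of Python. This is only a skeleton for orientation: the snake_case method names are a Python adaptation of `getParent()` and `getMediaPipeline()`, and resolving the pipeline by walking up the parent chain is an assumption of this sketch, not a statement about how the server implements it.

```python
class MediaObject:
    """Base class: every object knows its parent and the pipeline it belongs to."""

    def __init__(self, parent=None):
        self.parent = parent

    def get_parent(self):
        return self.parent

    def get_media_pipeline(self):
        # Walk up the parent chain until a MediaPipeline is found (assumed behaviour).
        obj = self
        while obj is not None and not isinstance(obj, MediaPipeline):
            obj = obj.parent
        return obj


class MediaPipeline(MediaObject):
    def __init__(self):
        super().__init__(parent=None)
        self.elements = []   # the "elements" association in the diagram


class MediaElement(MediaObject):
    def __init__(self, pipeline):
        super().__init__(parent=pipeline)
        pipeline.elements.append(self)


class Endpoint(MediaElement): ...
class Filter(MediaElement): ...


pipeline = MediaPipeline()
endpoint = Endpoint(pipeline)
print(endpoint.get_media_pipeline() is pipeline)  # True
```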

Endpoints

WebRtcEndpoint: Input/output endpoint that provides media streaming for Real Time Communications (RTC) through the web. It implements WebRTC technology to communicate with browsers.

RtpEndpoint: Input/output endpoint that provides bidirectional content delivery capabilities with remote networked peers, through the RTP protocol. It uses SDP for media negotiation.

HttpPostEndpoint: Input endpoint that accepts media sent through HTTP POST requests, in the same way as an HTTP file upload.

PlayerEndpoint: Input endpoint that retrieves content from the file system, an HTTP URL, or an RTSP URL, and injects it into the Media Pipeline.

RecorderEndpoint: Output endpoint that stores content in reliable mode (it does not discard data). It contains Media Sink pads for audio and video.

The following class diagram shows the main endpoint classes:

digraph endpoints {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  edge [
    arrowtail = "empty";
    dir = "back";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
  ]

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "rect";
    style = "dashed";
  ]

  "MediaObject" -> "MediaElement";
  "MediaElement" -> "Endpoint";
  "Endpoint" -> "SessionEndpoint";
  "Endpoint" -> "UriEndpoint";
  "SessionEndpoint" -> "HttpEndpoint";
  "SessionEndpoint" -> "SdpEndpoint";
  "SdpEndpoint" -> "BaseRtpEndpoint";

  node [ style = "filled" ]

  "HttpEndpoint" -> "HttpPostEndpoint";
  "BaseRtpEndpoint" -> "RtpEndpoint";
  "BaseRtpEndpoint" -> "WebRtcEndpoint";
  "UriEndpoint" -> "PlayerEndpoint";
  "UriEndpoint" -> "RecorderEndpoint";
}

Class diagram of Kurento Endpoints. In blue, the classes that a final API client will actually use.

Filters

Filters are MediaElements that perform media processing, Computer Vision, Augmented Reality, and so on.

ZBarFilter: Detects QR and bar codes in a video stream. When a code is found, the filter raises a CodeFoundEvent. Clients can add a listener to this event to execute some action.
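The listener mechanism can be pictured with a small observer-pattern sketch in Python. Everything here is a stand-in: `ToyZBarFilter`, `add_code_found_listener`, and `simulate_detection` are hypothetical names, and the `code_type` and `value` fields on the event are assumed for illustration rather than taken from the Kurento API reference.

```python
from dataclasses import dataclass


@dataclass
class CodeFoundEvent:
    code_type: str   # assumed field, e.g. "QR-Code"
    value: str       # assumed field: the decoded payload


class ToyZBarFilter:
    """Toy filter that notifies registered listeners when a code is 'found'."""

    def __init__(self):
        self._listeners = []

    def add_code_found_listener(self, listener):
        self._listeners.append(listener)

    def simulate_detection(self, code_type, value):
        # Stands in for the server detecting a code in the video stream.
        event = CodeFoundEvent(code_type, value)
        for listener in self._listeners:
            listener(event)


found = []
zbar = ToyZBarFilter()
zbar.add_code_found_listener(lambda event: found.append(event.value))
zbar.simulate_detection("QR-Code", "https://example.com")
print(found)  # ['https://example.com']
```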

FaceOverlayFilter: Detects faces in a video stream and overlays them with a configurable image.

GStreamerFilter: Generic filter interface that allows injecting any GStreamer element into a Kurento Media Pipeline. Note, however, that the current implementation of GStreamerFilter only allows a single element to be injected per filter. To inject more than one element, use several GStreamerFilters.

Usage of some popular GStreamer elements requires installation of additional packages. For example, overlay elements such as timeoverlay or textoverlay require installation of the gstreamer1.0-x package, which will also install the Pango rendering library.
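The one-element-per-filter rule for GStreamerFilter suggests planning a chain of filters up front. The Python sketch below does only that planning: `plan_filter_chain` is a hypothetical helper, the `command` key is our assumed name for the element description passed to each filter, and nothing is sent to a media server. The `!` check relies on the gst-launch convention of using `!` to link elements.

```python
def plan_filter_chain(commands):
    """Return one filter spec per GStreamer element plus the connections between them."""
    for command in commands:
        if "!" in command:
            # "!" links elements in gst-launch syntax, so this describes more than one.
            raise ValueError(f"one element per filter, got a chain: {command!r}")
    filters = [{"type": "GStreamerFilter", "command": c} for c in commands]
    # Connect filter i to filter i + 1 so media flows through them in order.
    links = [(i, i + 1) for i in range(len(filters) - 1)]
    return filters, links


filters, links = plan_filter_chain([
    "videoflip method=horizontal-flip",
    "videobalance saturation=0.0",
])
print(links)  # [(0, 1)]
```

Each entry in `filters` would become its own GStreamerFilter, and each pair in `links` a connection from one filter to the next.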

The following class diagram shows the main filter classes:

digraph filters {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  edge [
    arrowtail = "empty";
    dir = "back";
  ]

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "rect";
    style = "dashed";
  ]

  "MediaObject" -> "MediaElement" -> "Filter";
  "GstZBar"

  node [ style = "filled" ]

  "Filter" -> "FaceDetector" -> "FaceOverlay";
  "Filter" -> "ImageOverlay" -> "FaceOverlay";
  "Filter" -> "LogoOverlay";
  "Filter" -> "MovementDetector";
  "Filter" -> "GStreamerFilter";
  "Filter" -> "OpenCVFilter";
  "Filter" -> "ZBarFilter";
  "GstZBar" -> "ZBarFilter";
}

Class diagram of Kurento Filters. In blue, the classes that a final API client will actually use.

Hubs

Hubs are media objects in charge of managing multiple media flows in a pipeline. A Hub has several hub ports where other Media Elements are connected.

Composite: Mixes the audio streams of its connected inputs and arranges their video streams in a grid.

DispatcherOneToMany: Sends a given input to all the connected output HubPorts.

Dispatcher: Routes between arbitrary input-output HubPort pairs.
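As an illustration of how a Hub fans media out to its ports, the following toy Python model imitates the one-to-many behaviour described for DispatcherOneToMany. The class and method names are hypothetical, frames are plain strings, and the choice not to echo the stream back to the source port is an assumption of this sketch.

```python
class ToyDispatcherOneToMany:
    """Toy hub: frames from the selected source port are copied to the other ports."""

    def __init__(self):
        self.ports = {}      # port id -> frames delivered to that port
        self.source = None   # port currently selected as the input

    def add_port(self, port_id):
        self.ports[port_id] = []

    def set_source(self, port_id):
        if port_id not in self.ports:
            raise KeyError(port_id)
        self.source = port_id

    def push(self, port_id, frame):
        """Forward `frame` to every other port, but only if it comes from the source."""
        if port_id != self.source:
            return
        for other, delivered in self.ports.items():
            if other != port_id:
                delivered.append(frame)


hub = ToyDispatcherOneToMany()
for port in ("alice", "bob", "carol"):
    hub.add_port(port)
hub.set_source("alice")
hub.push("alice", "frame0")   # forwarded to bob and carol
hub.push("bob", "frame1")     # dropped: bob is not the source
print(hub.ports)  # {'alice': [], 'bob': ['frame0'], 'carol': ['frame0']}
```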

The following class diagram shows the Hub classes:

digraph hubs {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  edge [
    arrowtail = "empty";
    dir = "back";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
  ]

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "rect";
    style = "dashed";
  ]

  "MediaObject" -> "Hub";
  "MediaObject" -> "MediaElement";

  node [ style = "filled" ]

  "MediaElement" -> "HubPort";
  "Hub" -> "HubPort" [headlabel = "*", constraint = false, dir = normal, arrowhead = "vee", labelangle = -70.0, labeldistance = 0.9];

  "Hub" -> "AlphaBlending";
  "Hub" -> "Composite";
  "Hub" -> "Dispatcher";
  "Hub" -> "DispatcherOneToMany";
  "Hub" -> "Mixer";
}

Class diagram of Kurento Hubs. In blue, the classes that a final API client will actually use.

Example Modules

In addition to the base features, some example modules are provided for demonstration purposes:

Kurento modules architecture

Kurento Media Server can be extended with example modules (chroma, crowddetector, platedetector, pointerdetector) and also with other custom modules.

These example modules are provided to show how to extend the base features of Kurento Media Server:

  • Chroma: Takes a color range from the top-left area of the video, and makes it transparent, revealing another background image.

  • CrowdDetector: Detects groups of people in video streams.

  • PlateDetector: Detects vehicle license plates in video streams.

  • PointerDetector: Detects pointers in video streams, based on color tracking.

Warning

These example modules are just prototypes and their results are not necessarily accurate or reliable. You can use them as a programming guideline, but we strongly discourage anyone from using them in production environments.

All example modules come preinstalled in the Kurento Docker images. For local installations, they can be installed separately with apt-get.

Taking into account these extra modules, the complete Kurento toolbox is extended as follows:

Extended Kurento Toolbox

The basic Kurento toolbox (left side of the picture) is extended with more Computer Vision and Augmented Reality filters (right side of the picture) provided by the example modules.

If you want to write your own modules, please read the section about Writing Kurento Modules.