
Kurento API

Kurento Media Server is controlled through the API it exposes, so application developers can interact with it from high-level languages. The Kurento project already provides Kurento Client implementations of this API for several platforms.

If you prefer a programming language different from the supported ones, you can implement your own Kurento Client by using the Kurento Protocol, which is based on WebSocket and JSON-RPC.
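
To give a feel for what a custom client has to produce, the following minimal sketch builds a JSON-RPC 2.0 request that could be sent as a text frame over a WebSocket. Only the JSON-RPC 2.0 envelope (`jsonrpc`, `id`, `method`, `params`) is standard; the `make_request` helper, the `create` method name and the `type` parameter are assumptions made for illustration and should be checked against the Kurento Protocol reference.

```python
import json
from itertools import count

# Ids let the client match each response to the request that caused it.
_request_ids = count(1)


def make_request(method, params):
    """Serialize a JSON-RPC 2.0 request, ready to send as a WebSocket text frame."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    })


# Hypothetical payload: the method name and parameter keys are assumptions
# for illustration; take the real ones from the Kurento Protocol reference.
message = make_request("create", {"type": "MediaPipeline"})
print(message)
```

Opening the WebSocket connection itself is left out on purpose, since it depends on the WebSocket library chosen for the target language.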

In the following sections we will describe the Kurento API from a high-level point of view, showing the media capabilities exposed by Kurento Media Server to clients. If you want to see working demos using Kurento, please refer to the Tutorials section.

Media Elements and Media Pipelines

Kurento is based on two concepts that act as building blocks for application developers:

  • Media Elements. A Media Element is a functional unit that performs a specific action on a media stream. Each capability is presented to the application developer as a self-contained “black box” (the Media Element), so the developer does not need to understand the low-level details of the element in order to use it. Media Elements are capable of receiving media from other elements (through media sources) and of sending media to other elements (through media sinks). Depending on their function, Media Elements can be split into different groups:

    • Input Endpoints: Media Elements capable of receiving media and injecting it into a pipeline. There are several types of input endpoints. File input endpoints take the media from a file, Network input endpoints take the media from the network, and Capture input endpoints are capable of capturing the media stream directly from a camera or other kind of hardware resource.
    • Filters: Media Elements in charge of transforming or analyzing media. Hence there are filters for performing operations such as mixing, muxing, analyzing, augmenting, etc.
    • Hubs: Media Objects in charge of managing multiple media flows in a pipeline. A Hub contains one HubPort for each of the Media Elements connected to it. Depending on the Hub type, there are different ways to control the media. For example, the Hub called Composite merges all input video streams into a single output video stream, with all inputs arranged in a grid.
    • Output Endpoints: Media Elements capable of taking a media stream out of the Media Pipeline. Again, there are several types of output endpoints, specialized in files, network, screen, etc.
  • Media Pipeline: A Media Pipeline is a chain of Media Elements, where the output stream generated by a source element is fed into one or more sink elements. Hence, the pipeline represents a “pipe” capable of performing a sequence of operations over a stream.

    Media Pipeline example

    Example of a Media Pipeline implementing an interactive multimedia application that receives media from a WebRtcEndpoint, overlays an image on the detected faces, and sends back the resulting stream
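
The idea of chaining black boxes can be sketched with a toy model. The code below is not the Kurento Client API: `MediaElement`, `connect` and `push` are hypothetical stand-ins written only to show how the output of a source element is fed into its sink elements. The toy also uses two separate objects for the receiving and sending sides of the WebRTC connection, purely to keep the example acyclic.

```python
class MediaElement:
    """Toy stand-in for a Media Element: a named black box with a transform."""

    def __init__(self, name, transform=lambda frame: frame):
        self.name = name
        self.transform = transform
        self.sinks = []  # downstream elements fed by this element's output

    def connect(self, sink):
        """Feed this element's output into `sink`."""
        self.sinks.append(sink)

    def push(self, frame, delivered):
        """Process a frame and forward it; record frames that reach a leaf."""
        out = self.transform(frame)
        if not self.sinks:
            delivered.append((self.name, out))
        for sink in self.sinks:
            sink.push(out, delivered)


# Hypothetical pipeline: receive -> overlay -> send back.
webrtc_in = MediaElement("webrtc_in")
face_overlay = MediaElement("face_overlay", lambda frame: frame + "+hat")
webrtc_out = MediaElement("webrtc_out")

webrtc_in.connect(face_overlay)
face_overlay.connect(webrtc_out)

delivered = []
webrtc_in.push("frame0", delivered)
print(delivered)  # [('webrtc_out', 'frame0+hat')]
```

Frames are plain strings here so that the effect of each element stays visible; a real pipeline carries encoded audio and video inside the media server.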

The Kurento API is Object-Oriented. This means that it is based on Classes that can be instantiated in the form of Objects; these Objects provide properties that are a representation of the internal state of the Kurento server, and methods that expose the operations that can be performed by the server.

The following class diagram shows some of the relationships of the main classes in the Kurento API:

digraph mediaobjects {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "record";
    style = "filled";
  ]

  edge [
    arrowtail = "empty";
    dir = "back";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
  ]

  "MediaObject" [
    label = "{MediaObject|"
      + "+ getMediaPipeline() : MediaPipeline\l"
      + "+ getParent() : MediaObject[]\l}";
    labelurl = "MediaObject";
  ]

  "MediaElement" [
    label = "{MediaElement|"
      + "+ connect(...) : void\l"
      + "+ getMediaSinks(...) : MediaSink[]\l"
      + "+ getMediaSrcs(...) : MediaSource[]\l}";
    urllabel = "MediaElement";
  ]

  "MediaObject" -> "MediaPipeline";
  "MediaObject" -> "MediaElement";
  "MediaObject" -> "Hub";

  "MediaObject" -> "MediaObject" [label = "parent", constraint = false, dir = normal, arrowhead = "vee"];

  "MediaObject" -> "MediaPipeline" [label = "pipeline", constraint = false, dir = normal, arrowhead = "vee"];

  "MediaPipeline" -> "MediaElement" [headlabel = "*" label = "elements", constraint = false, dir = normal, arrowhead = "vee"];

  "MediaElement" -> "Endpoint";
  "MediaElement" -> "Filter";
  "MediaElement" -> "HubPort";

  "Hub" -> "HubPort" [headlabel = "*", constraint = false, dir = normal, arrowhead = "vee"];
}

Class diagram of main classes in Kurento API
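
The relationships in the diagram can also be read as code. The following is a rough Python sketch of the hierarchy, not the actual client classes: the snake_case method names mirror `getParent()` and `getMediaPipeline()` from the diagram, and the way `get_media_pipeline` walks up the parent chain is an assumption made for illustration.

```python
class MediaObject:
    """Base of the hierarchy: every object knows its parent."""

    def __init__(self, parent=None):
        self.parent = parent

    def get_parent(self):
        return self.parent

    def get_media_pipeline(self):
        # Assumption: the pipeline is found by walking up the parent chain.
        obj = self
        while obj is not None and not isinstance(obj, MediaPipeline):
            obj = obj.parent
        return obj


class MediaPipeline(MediaObject):
    def __init__(self):
        super().__init__(parent=None)
        self.elements = []  # the "elements *" association in the diagram


class MediaElement(MediaObject):
    def __init__(self, pipeline):
        super().__init__(parent=pipeline)
        pipeline.elements.append(self)


class Endpoint(MediaElement):
    pass


class Filter(MediaElement):
    pass


class Hub(MediaObject):
    def __init__(self, pipeline):
        super().__init__(parent=pipeline)
        self.ports = []  # the "Hub -> HubPort *" association


class HubPort(MediaElement):
    def __init__(self, hub):
        # A HubPort hangs from its Hub rather than directly from the pipeline.
        MediaObject.__init__(self, parent=hub)
        hub.ports.append(self)


pipeline = MediaPipeline()
face_filter = Filter(pipeline)
hub = Hub(pipeline)
port = HubPort(hub)
print(port.get_media_pipeline() is pipeline)  # True
```

Note that, as in the diagram, a Hub is a MediaObject but not a MediaElement, while each HubPort is a MediaElement and can therefore be connected to other elements.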

Endpoints

A WebRtcEndpoint is an input/output endpoint that provides media streaming for Real Time Communications (RTC) through the web. It implements WebRTC technology to communicate with browsers.

../_images/WebRtcEndpoint.png

An RtpEndpoint is an input/output endpoint that provides bidirectional content delivery capabilities with remote networked peers, through the RTP protocol. It uses SDP for media negotiation.

../_images/RtpEndpoint.png

An HttpPostEndpoint is an input endpoint that accepts media through HTTP POST requests, in the same way as an HTTP file upload.

../_images/HttpPostEndpoint.png

A PlayerEndpoint is an input endpoint that retrieves content from the file system, an HTTP URL, or an RTSP URL and injects it into the Media Pipeline.

../_images/PlayerEndpoint.png

A RecorderEndpoint is an output endpoint that stores content in reliable mode (it does not discard data). It contains Media Sink pads for audio and video.

../_images/RecorderEndpoint.png

The following class diagram shows the relationships of the main endpoint classes:

digraph endpoints {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "record";
    style = "filled";
  ]

  edge [
    arrowtail = "empty";
    dir = "back";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
  ]

  "MediaElement" -> "Endpoint";
  "Endpoint" -> "SessionEndpoint";
  "Endpoint" -> "UriEndpoint";

  "SessionEndpoint" -> "HttpEndpoint";
  "SessionEndpoint" -> "SdpEndpoint";

  "HttpEndpoint" -> "HttpPostEndpoint";

  "SdpEndpoint" -> "RtpEndpoint";
  "SdpEndpoint" -> "WebRtcEndpoint";

  "UriEndpoint" -> "PlayerEndpoint";
  "UriEndpoint" -> "RecorderEndpoint";
}

Class diagram of main Endpoints in Kurento API
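
The input and output roles described above determine which endpoint can sit at which end of a connection. As a small sketch, the table below encodes the roles exactly as stated in the endpoint descriptions; the `can_connect` rule (the source must be an input endpoint, the sink must be an output endpoint) is a simplifying assumption for illustration and not a check performed by any Kurento client.

```python
# Roles taken from the endpoint descriptions in this section.
ENDPOINT_ROLES = {
    "WebRtcEndpoint": {"input", "output"},
    "RtpEndpoint": {"input", "output"},
    "HttpPostEndpoint": {"input"},
    "PlayerEndpoint": {"input"},
    "RecorderEndpoint": {"output"},
}


def can_connect(source, sink):
    """Toy rule: media flows from an input endpoint into an output endpoint."""
    return "input" in ENDPOINT_ROLES[source] and "output" in ENDPOINT_ROLES[sink]


print(can_connect("PlayerEndpoint", "RecorderEndpoint"))  # True
print(can_connect("RecorderEndpoint", "PlayerEndpoint"))  # False
print(can_connect("WebRtcEndpoint", "WebRtcEndpoint"))    # True
```

Under this rule a WebRtcEndpoint can be connected to itself, which corresponds to a loopback where the browser receives its own media back.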

Filters

Filters are MediaElements that perform media processing, Computer Vision, Augmented Reality, and so on.

The ZBarFilter filter detects QR and bar codes in a video stream. When a code is found, the filter raises a CodeFoundEvent. Clients can add a listener to this event to execute some action.

../_images/ZBarFilter.png
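
The listener mechanism can be illustrated with a toy model. Everything in the sketch below is hypothetical: `ToyZBarFilter`, `add_code_found_listener`, `simulate_detection` and the event fields are invented for the example, and in a real deployment the detection happens inside the media server rather than in client code.

```python
from dataclasses import dataclass


@dataclass
class CodeFoundEvent:
    # Hypothetical fields, chosen for illustration only.
    code_type: str
    value: str


class ToyZBarFilter:
    """Toy stand-in that only models listener registration and dispatch."""

    def __init__(self):
        self._listeners = []

    def add_code_found_listener(self, listener):
        self._listeners.append(listener)

    def simulate_detection(self, code_type, value):
        """Pretend a code was seen in the stream and notify every listener."""
        event = CodeFoundEvent(code_type, value)
        for listener in self._listeners:
            listener(event)


found = []
zbar = ToyZBarFilter()
zbar.add_code_found_listener(lambda event: found.append(event.value))
zbar.simulate_detection("QR-Code", "https://example.org")
print(found)  # ['https://example.org']
```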

The FaceOverlayFilter filter detects faces in a video stream and overlays them with a configurable image.

../_images/FaceOverlayFilter.png

GStreamerFilter is a generic filter interface that allows injecting any GStreamer element into a Kurento Media Pipeline. Note, however, that the current implementation accepts only a single element per filter; to inject more than one element, use several GStreamerFilters.

../_images/GStreamerFilter.png
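
Because each GStreamerFilter holds a single element, a chain written in the usual `a ! b` GStreamer notation has to be split into one command per filter. The helper below is a minimal sketch of that step; `split_gst_chain` is a hypothetical name, and the naive split on `!` assumes that no property value contains that character.

```python
def split_gst_chain(description):
    """Split 'a ! b ! c' into one command string per GStreamer element."""
    commands = [part.strip() for part in description.split("!")]
    if any(not command for command in commands):
        raise ValueError("empty element in description: %r" % description)
    return commands


commands = split_gst_chain(
    "videoflip method=horizontal-flip ! videobalance saturation=0.0"
)
print(commands)
# ['videoflip method=horizontal-flip', 'videobalance saturation=0.0']
```

Each resulting string would then be given to its own GStreamerFilter, with the filters connected in the same order as the original chain.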

The following class diagram shows the relationships of the main filter classes:

digraph filters {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "record";
    style = "filled";
  ]

  edge [
    arrowtail = "empty";
    dir = "back";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
  ]

  "MediaElement" -> "Filter";
  "Filter" -> "ZBarFilter";
  "Filter" -> "FaceOverlayFilter";
  "Filter" -> "GStreamerFilter";
}

Class diagram of main Filters in Kurento API

Hubs

Hubs are media objects in charge of managing multiple media flows in a pipeline. A Hub has several hub ports where other Media Elements are connected.

Composite is a hub that mixes the audio streams of its connected inputs and arranges their video streams in a grid.

../_images/Composite.png

DispatcherOneToMany is a Hub that sends a given input to all the connected output HubPorts.

../_images/DispatcherOneToMany.png

Dispatcher is a hub that allows routing between arbitrary input-output HubPort pairs.

../_images/Dispatcher.png
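
The difference between the two dispatchers is easiest to see in terms of routing tables. The following toy model is not the Kurento API: the class names, `connect`, `set_source` and `deliver` are hypothetical, ports are identified by plain strings, and the choice not to echo the stream back to the source port in the one-to-many case is an assumption.

```python
class ToyDispatcher:
    """Arbitrary routing: each sink port listens to at most one source port."""

    def __init__(self):
        self._routes = {}  # sink port -> source port

    def connect(self, source, sink):
        self._routes[sink] = source

    def deliver(self, sender, frame):
        """Return {sink: frame} for every sink currently fed by `sender`."""
        return {sink: frame
                for sink, source in self._routes.items() if source == sender}


class ToyDispatcherOneToMany:
    """One selected source port is sent to all the other ports."""

    def __init__(self, ports):
        self._ports = list(ports)
        self._source = None

    def set_source(self, port):
        if port not in self._ports:
            raise ValueError("unknown port: %r" % port)
        self._source = port

    def deliver(self, sender, frame):
        if sender != self._source:
            return {}  # media from non-selected ports goes nowhere
        # Assumption: the source port does not receive its own stream back.
        return {port: frame for port in self._ports if port != sender}


dispatcher = ToyDispatcher()
dispatcher.connect("alice", "bob")
dispatcher.connect("carol", "alice")
print(dispatcher.deliver("alice", "frame"))  # {'bob': 'frame'}

one_to_many = ToyDispatcherOneToMany(["presenter", "viewer1", "viewer2"])
one_to_many.set_source("presenter")
print(one_to_many.deliver("presenter", "frame"))
# {'viewer1': 'frame', 'viewer2': 'frame'}
```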

The following class diagram shows the relationships of the hubs:

digraph hubs {
  bgcolor = "transparent";
  fontname = "Bitstream Vera Sans";
  fontsize = 8;
  size = "12,8";

  node [
    fillcolor = "#E7F2FA";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
    shape = "record";
    style = "filled";
  ]

  edge [
    arrowtail = "empty";
    dir = "back";
    fontname = "Bitstream Vera Sans";
    fontsize = 8;
  ]

  "MediaObject" -> "Hub";
  "MediaObject" -> "MediaElement";

  "Hub" -> "HubPort" [headlabel = "*", constraint = false, dir = normal, arrowhead = "vee", labelangle = 60];

  "MediaElement" -> "HubPort";

  "Hub" -> "Composite";
  "Hub" -> "Dispatcher";
  "Hub" -> "DispatcherOneToMany";
}

Class diagram of main Hubs in Kurento API