Red5 Pro Stream Manager User Guide

Red5 Pro Stream Manager is a streaming architecture management and information service which helps automate the process of creating and deleting Red5 Pro server instances. Stream Manager also coordinates between broadcasters and subscribers to help them find the right servers for their broadcast and subscribe actions, respectively.

Red5 Pro Stream Manager provides you with accurate stream statistics over simple HTTP-based REST API calls once publishing has started for a stream. It provides automatic traffic management using the Red5 Pro Autoscaler component: your server fleet automatically expands and contracts as traffic increases and decreases over time. This reduces unnecessary server usage, thereby reducing your cloud platform bills. Stream Manager also monitors the Red5 Pro service on each node and replaces faulty nodes to prevent service disruptions.


Concepts

Node

A Node refers to a single server instance in the streaming architecture. Each Node may belong to a NodeGroup with a specific instance role assigned to it.

A Node is configured using the launch configuration defined in the NodeGroup that it belongs to. The lifecycle of a Node is tied to the NodeGroup that it belongs to. Once the group is set to be deleted, each node belonging to the group is deleted as well.

Node Types

Autoscaling assigns different roles to Red5 Pro instances so that each contributes efficiently to the streaming architecture. Currently Red5 Pro autoscaling supports the following node roles:

  • ORIGIN: Origin nodes are responsible for providing an ingest point to publishers.
  • EDGE: Edge nodes offer egress points to subscribers.
  • RELAY: Relay nodes act as stream repeaters. They are very useful when building large-scale streaming systems that span continents. The cost & performance of having each edge connect to an origin in a different part of the globe can be optimized by setting up a relay locally in that continent/geozone that repeats the origin stream, and having edges in that continent/geozone read the stream from the relay instead.
  • TRANSCODER: Transcoder nodes are currently responsible for generating multi-bitrate outputs for a single stream. Expect a lot more from them in the future, such as adding overlays, merging streams, etc.

NodeGroup

NodeGroup is a concept of virtually categorizing one or more nodes into a group. Each nodegroup is identified uniquely using a group id or name. Each group can have one or more node types in it at targeted regions.

There are two types of NodeGroups that can be established using Stream Manager for autoscaling:

TIER 1 Groups : Tier 1 nodegroups are the common type of nodegroup, consisting of one or more origin(s), one or more edge(s), and optionally one or more transcoder(s).

TIER 2 Groups : Tier 2 nodegroups are meant for large-scale deployments (both geographically and in terms of the number of publishers/subscribers). These nodegroups consist of one or more origin(s), one or more edge(s), one or more relay(s), and optionally one or more transcoder(s).

Each group also has an associated launch configuration and scale policy defined, which provide the details of what to scale, how to scale, where to scale, etc.

Throughout this document we may use the terms NodeGroup and cluster interchangeably to describe a group of Red5 Pro nodes in the context of autoscaling.

GeoZones And Regions

For better geographical management of the streaming service, Stream Manager allows you to categorize node availability in three layers of location tiers - geozone, region and availabilityZone. While the concept of regions and availability zones is common to popular cloud platforms, Stream Manager additionally allows you to group regions into geozones.

  • geozone : A geozone is a macro location concept. Geozones are used to group one or more regions together. For example, if you have two regions in Asia called east-asia-1 and west-asia-1, they can be grouped into a geozone called asia.

  • region : A region is the second layer of geographical location abstraction, which provides services, nodes etc. Example: us-east-1 or us-west-2.

  • availabilityZone : An availability zone in a cloud environment is exactly as the platform defines it (isolated locations / data centers). For simulated cloud environments, availability zones are virtual / made-up values which are not actual locations.

Once Stream Manager has knowledge of your geozones and regions, you can specify a geozone code instead of a region code in the Read Stream API as a request parameter. Stream Manager will automatically know which regions you are targeting via the geozone code. For more information take a look at the Read Stream API. In other words, you can use geozones to request origins/edges from multiple regions at once.

Geozones and Regions are applicable for both cloud based environments and simulated cloud environments. Simulated cloud environments can define their own regions and group them into custom geozones.
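
For illustration, a Read Stream request targeting the asia geozone from the example above might look like the following (the scope and stream name are hypothetical placeholders, and the region request parameter is assumed to accept the geozone code as described above):

http(s)://{host}:{port}/streammanager/api/<api-version>/event/live/mystream?action=subscribe&region=asia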

Red5 Pro Stream Manager

Red5 Pro Stream Manager is a web application which encapsulates and manages various responsibilities of a Red5 Pro autoscaling architecture, such as:

  • Cloud platform instance deployment
  • Node management
  • NodeGroup management
  • NodeGroup autoscaling
  • Coordinating publishers and subscribers to appropriate node endpoints for stream requests
  • Clustering, declustering & monitoring
  • Gathering & aggregating traffic information across active nodes.

Red5CloudWatch

Red5CloudWatch is a subcomponent within Red5 Pro Stream Manager which acts as a communication bridge between an active Red5 Pro node and the stream manager.

Technically it is a Java servlet which formats and relays incoming HTTP/HTTPS communication from a Red5 Pro Node to the Stream Manager Red5CloudWatchClient.

The Red5CloudWatch relays important notifications such as:

  • Cluster Report: NodeGroup statistics about load and streams
  • Stream publish: Stream publish start notification from the origin
  • Stream unpublish: Stream unpublish notification from the origin
  • Node role assignment: Request by a newly initialized node to acquire its role in the group & operational configuration information.

Red5CloudWatchClient

Red5CloudWatchClient is a subcomponent inside Red5 Pro Stream Manager which is responsible for processing notifications relayed by the Red5CloudWatch servlet.

The Red5CloudWatchClient evaluates & processes various autoscaling alarm conditions and dispatches appropriate notifications to the Autoscaler component for each NodeGroup separately.

CloudSweeper

CloudSweeper is an internal cron process that checks for lost nodes on the target platform. Lost nodes refer to instances that exist on the platform, originating from the current Stream Manager, but are not registered in the Stream Manager database. Suspected lost nodes are initially added to a blacklist and then removed from the platform using the platform API. You can configure the CloudSweeper run interval via the instancecontroller.cloudCleanupInterval property in the red5-web.properties file.

DeadNodeCleaner

DeadNodeCleaner is an internal cron process that checks for stranded nodes, stranded node groups and unresponsive nodes. Stream Manager automatically configures the run interval for this cron process using the cluster.reportingSpeed property value from the CLUSTER CONFIGURATION SECTION located in the red5-web.properties file.

NodeSynchronizer

NodeSynchronizer is an internal one-time task which runs when Stream Manager starts up and attempts to sync the IP addresses of nodes in the database with those running on the cloud platform. Instances not found on the platform are removed from the data store.

Autoscaler

The Autoscaler component in Red5 Pro Stream Manager handles autoscaling activities. The Autoscaler receives alarm notifications from the Red5CloudWatchClient accompanied by any necessary data. It then uses information from the group's launch configuration and scale policy to perform the autoscaling operation dictated by the alarm.

Once an autoscale operation is launched successfully, the Autoscaler blocks additional autoscaling activities by ignoring further alarms until the newly launched node has settled into the cluster. The Autoscaler is responsible for matching alarms with the appropriate action logic required to update the nodegroup.

Launch Configuration

A launch configuration is a configuration definition designed to help Red5 Pro Stream Manager launch instances on the cloud platform. In simpler terms, a launch configuration describes how to launch an instance.

Configurations are stored in the Stream Manager data store. A launch configuration defines an important set of parameters such as the Red5 Pro image name, the machine type to use, the estimated connection capacity of the instance, arbitrary metadata for the instance, as well as arbitrary special properties for the cloud controller. Parameters such as machine type and connection capacity are configured per instance role (origin/edge/relay/transcoder).

A stream manager auto scaling launch configuration template looks like this:

{
    "launchconfig": {
        "name": "<configuration-name>",
        "description": "<configuration-description>",
        "image": "<red5pro-image>",
        "version": "0.0.3",

        "targets": {
            "target": [{
                "role": "<role>",
                "instanceType": "<instance-type>",
                "connectionCapacity": "<instance-capacity>"
            }]
        },

        "properties": {
            "property": [{
                "name": "<property-name>",
                "value": "<property-value>"
            }]
        },
        "metadata": {
            "meta": [{
                "key": "<meta-name>",
                "value": "<meta-value>"
            }]
        }
    }
}

For more information on working with launch configurations, check out the Stream Manager API Documentation.
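
For illustration only, a filled-in launch configuration might look like the following (the image name, instance types and capacities are hypothetical values, not recommendations):

{
    "launchconfig": {
        "name": "example-launch-config",
        "description": "Example configuration for a small origin/edge group",
        "image": "red5pro-server-image",
        "version": "0.0.3",
        "targets": {
            "target": [{
                "role": "origin",
                "instanceType": "n1-standard-2",
                "connectionCapacity": "500"
            }, {
                "role": "edge",
                "instanceType": "n1-standard-2",
                "connectionCapacity": "1000"
            }]
        },
        "properties": {
            "property": [{
                "name": "example-property",
                "value": "example-value"
            }]
        },
        "metadata": {
            "meta": [{
                "key": "example-key",
                "value": "example-value"
            }]
        }
    }
}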

Scale Policy

A Scale Policy defines simple rules for scaling a NodeGroup. Generally there are two types of scaling activities: scale-in (contracting a group) and scale-out (expanding a group). A scale policy describes how to scale-in / scale-out a group. Policies are stored in the Stream Manager data store.

A scale policy defines an important set of parameters such as the minimum number of nodes, the maximum number of nodes, etc. Each of these parameters is configurable per instance role (origin/edge/relay/transcoder).

A scale policy template looks like this:

{
    "policy": {
        "name": "<policy-name>",
        "description": "<policy-description>",
        "type": "<policy-type>",
        "version": "<policy-version>",
        "targets": {
            "region": [{
                "name": "default",
                "target": [{
                    "role": "<role>",
                    "minLimit": "<min-node-count>",
                    "maxLimit": "<max-node-count>",
                    "scaleAdjustment": "<node-scale-adjustment>"
                }]
            }]
        }
    }
}

For more information on working with scale policies, check out the Stream Manager API Documentation.
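
For illustration only, a filled-in scale policy might look like the following (the type and all limits are hypothetical values; derive your own from your traffic estimates):

{
    "policy": {
        "name": "example-scale-policy",
        "description": "Example policy for a small origin/edge group",
        "type": "default",
        "version": "0.0.1",
        "targets": {
            "region": [{
                "name": "default",
                "target": [{
                    "role": "origin",
                    "minLimit": "1",
                    "maxLimit": "3",
                    "scaleAdjustment": "1"
                }, {
                    "role": "edge",
                    "minLimit": "2",
                    "maxLimit": "10",
                    "scaleAdjustment": "1"
                }]
            }]
        }
    }
}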

Instance Warm Up Time

Instance Warm Up Time refers to the time required by a new Red5 Pro instance to reach a stable INSERVICE state from the time it was launched. It depends on your choice of cloud platform instance type and any special settings that your instance configuration may contain.

While you do not need to calculate this parameter explicitly, it is useful to know the approximate warm-up times for your target platform.

The configuration property instancecontroller.newNodePingTimeThreshold, found in the Stream Manager configuration file red5-web.properties, corresponds to the time allowed for a node to ping Stream Manager for the first time after its launch. It is kept sufficiently higher than the warm-up time.

If a node fails to ping the Stream Manager within this time, it is considered a failed launch and the node is removed from the data store without any further wait.

Metric

Metrics are system attributes which influence scaling decisions - e.g., CPU usage, RAM usage, etc. In the context of Red5 and Red5 Pro, important metrics would be: CPU usage, JVM memory, connections and streams.

Metrics are used to set Alarms internally by defining Thresholds. When a Metric value violates the defined threshold value, it is termed a "Threshold violation".

The current version of the Autoscaler concerns itself only with CONNECTION LOAD PERCENTAGE, and hence the only Metric it deals with is CONNECTIONS.

Threshold

Thresholds are bounds for metric values. For example, an upper threshold for connections for autoscaling may be defined as 80%, which is the trigger point of a scale-out Alarm.

Thus if the connection load goes over 80%, it indicates a threshold violation and results in the respective Alarm being triggered. For example, with an estimated connectionCapacity of 500 and an 80% upper threshold, a scale-out alarm would be raised once a node crosses roughly 400 connections. The Threshold is encapsulated within an Alarm object as part of the Alarm definition.

To see how to get or set alarm thresholds, check out the Stream Manager API Documentation.

Alarm

Alarms are objects that define a condition for Autoscaler. Each alarm is meant to monitor a metric for threshold violation against a defined threshold condition (upper or lower).

When a metric value breaches its defined threshold boundary, the system will trigger an event causing a notification to be sent from the alarm evaluator Red5CloudWatchClient to the Autoscaler component. The alarm may carry additional data about the reporting Node or NodeGroup and an action tag implying what action should be taken by the Autoscaler.

To see how to change alarm thresholds, check out the Stream Manager API Documentation.

Action

Actions are responses triggered by the Autoscaler due to an Alarm. Actions carry the actual autoscaling logic.

Stream Manager represents Actions using tags, and later maps these tags to appropriate action logic implementations. Each Action is associated with one Alarm at a time.

Sample Alarm Actions

  • GROUPEDGESCALEOUT: Implies that an edge scale-out operation is required
  • NEWCLUSTERINITIALIZE: Implies that a cluster initialization operation is required

SIMULATED CLOUD

Stream Manager provides an intriguing feature for managing custom non-cloud instances called Simulated Cloud. The Simulated Cloud is a concept of virtualizing an actual cloud platform using a set of custom APIs and some smart programming.

The virtualization uses a simple datastore to manage instances and simulates the various states of an instance to resemble the states usually seen for a cloud instance (on AWS, Azure, Google Compute etc). The simulation of an actual cloud platform allows users to manage their own physical or virtual machines as if they were cloud instances, using the Simulated Cloud Controller.

The simulated cloud platform can also treat cloud instances as self-managed instances and turn them into Simulated Cloud instances that can be managed using the Simulated Cloud Controller.

The main difference to understand between cloud and simulated cloud instances is that while cloud instances can actually be created and destroyed on the respective platform, simulated cloud instances are usually always-running instances, whose behaviour is encapsulated using a data store & APIs to emulate creation and removal on the surface.

To summarize, if you have a group of instances that you want to use for Red5 Pro streaming with auto-clustering and autoscaling features, you can manage them using the simulated cloud deployment.

To learn more about simulated cloud APIs, operations and setup, please see the simulated cloud deployment guide.


BEST NODE SELECTION

Stream Manager has a built-in mechanism for selecting & providing the best (optimal) node for a given streaming operation. This mechanism allows Stream Manager to select an optimal node based on the state of available resources on it.

Best node selection logic uses a metric-weight evaluation mechanism to determine the optimal candidate. When a publisher/subscriber request arrives on the Stream Manager requesting an origin/edge for publishing/subscribing, it creates a pool of nodes suitable for the request. Stream Manager then evaluates the node-score for each candidate node in the pool. The node having the best node-score is then returned to the client.

The InstanceMetricsRuleManager component in Stream Manager uses predefined metric rules, where specified metrics are configured with predefined metric weights which reflect their importance relative to other metrics.

Each metric in the metric evaluation system can be put in one of two categories - LOWERBETTER or HIGHERBETTER. For a metric of type LOWERBETTER, an increase in value leads to a decrease in weight, and for a metric of type HIGHERBETTER an increase in value causes an increase in weight.

Currently, all of the defined metrics except availableMemory (which is HIGHERBETTER, as shown below) are of type LOWERBETTER.

Stream Manager internally deals with two types of metrics:

  • STATIC : Metrics that are directly available via node reports.
  • DYNAMIC : Metrics that are not directly available, but are derived using STATIC metrics.

Metrics that affect the best node selection algorithm are defined in the configuration file applicationContext.xml.

A current net weight, or node score, is evaluated for all candidate nodes of a node type using all of the defined metrics that influence the algorithm, the current value of each metric, and the assigned weight for the respective metric. The node score refers to the sum of all metric weights for a node. The node with the highest score is preferred over other candidate nodes.
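
Stream Manager's actual scoring implementation is internal, but the following minimal Python sketch illustrates the mechanism described above under stated assumptions: each metric is a percentage, each metric's contribution scales linearly with its value, and the example weights mirror the red5-web.properties defaults shown later in this guide.

from dataclasses import dataclass

@dataclass
class MetricRule:
    name: str          # metric name, e.g. "clientCount"
    weight: float      # configured metric weight (attached to the 100% mark)
    direction: str     # "LOWERBETTER" or "HIGHERBETTER"

def node_score(rules, readings):
    """Sum every metric's weight contribution for one node (higher wins)."""
    score = 0.0
    for rule in rules:
        fraction = readings[rule.name] / 100.0        # current value, in percent
        if rule.direction == "LOWERBETTER":
            score += rule.weight * (1.0 - fraction)   # less load -> more weight
        else:
            score += rule.weight * fraction           # more free -> more weight
    return score

# Hypothetical edge rules: free connection slots + available memory.
edge_rules = [
    MetricRule("clientCount", 15, "LOWERBETTER"),
    MetricRule("availableMemory", 20, "HIGHERBETTER"),
]
edge_a = {"clientCount": 40, "availableMemory": 70}   # 40% load, 70% free
edge_b = {"clientCount": 75, "availableMemory": 30}
best = max((edge_a, edge_b), key=lambda r: node_score(edge_rules, r))
print(best)   # edge_a wins: lower connection load and more free memory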

DYNAMIC METRICS USED IN BEST NODE EVALUATION

Dynamic metrics are registered (for required node types) in the applicationContext.xml file (RED5_HOME/webapps/streammanager/WEB-INF/applicationContext.xml).

<bean id="freeConnectionSlotsMetric"
    class="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
    <property name="metricName">
        <value>clientCount</value>
    </property>
    <property name="unit">
        <value>PERCENTAGE</value>
    </property>
    <property name="minValue">
        <value>0</value>
    </property>
    <property name="maxValue">
        <value>100</value>
    </property>
    <property name="direction">
        <value>LOWERBETTER</value>
    </property>
    <property name="metricWeight">
        <value>${instanceevaluator.streams.metricweight}</value>
    </property>
</bean>

<bean id="streamCountMetric"
    class="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
    <property name="metricName">
        <value>publisherCount</value>
    </property>
    <property name="unit">
        <value>PERCENTAGE</value>
    </property>
    <property name="minValue">
        <value>0</value>
    </property>
    <property name="maxValue">
        <value>100</value>
    </property>
    <property name="direction">
        <value>LOWERBETTER</value>
    </property>
    <property name="metricWeight">
        <value>${instanceevaluator.streams.metricweight}</value>
    </property>
</bean>

<bean id="availableMemoryMetric"
    class="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
    <property name="metricName">
        <value>availableMemory</value>
    </property>
    <property name="unit">
        <value>PERCENTAGE</value>
    </property>
    <property name="minValue">
        <value>0</value>
    </property>
    <property name="maxValue">
        <value>100</value>
    </property>
    <property name="direction">
        <value>HIGHERBETTER</value>
    </property>
    <property name="metricWeight">
        <value>${instanceevaluator.memory.metricweight}</value>
    </property>
</bean>

<bean id="subscriberCountMetric"
    class="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
    <property name="metricName">
        <value>edgeSubscriberCount</value>
    </property>
    <property name="unit">
        <value>PERCENTAGE</value>
    </property>
    <property name="minValue">
        <value>0</value>
    </property>
    <property name="maxValue">
        <value>100</value>
    </property>
    <property name="direction">
        <value>LOWERBETTER</value>
    </property>
    <property name="metricWeight">
        <value>${instanceevaluator.subscribers.metricweight}</value>
    </property>
</bean>

<bean id="restreamerCountMetric"
    class="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
    <property name="metricName">
        <value>restreamerCount</value>
    </property>
    <property name="unit">
        <value>PERCENTAGE</value>
    </property>
    <property name="minValue">
        <value>0</value>
    </property>
    <property name="maxValue">
        <value>100</value>
    </property>
    <property name="direction">
        <value>LOWERBETTER</value>
    </property>
    <property name="metricWeight">
        <value>${instanceevaluator.restreamer.metricweight}</value>
    </property>
</bean>

<bean id="serverMetricsEvaluator" class="com.red5pro.services.streammanager.nodes.component.InstanceMetricsRuleManager">

    <property name="originMetricRules">
        <list   value-type="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
            <ref bean="freeConnectionSlotsMetric"></ref>  <!-- dynamically injected metrics -->
            <ref bean="streamCountMetric"></ref>  <!-- dynamically injected metrics -->
            <ref bean="subscriberCountMetric"></ref>  <!-- dynamically injected metrics -->
        </list>
    </property>

    <property name="edgeMetricRules">
        <list value-type="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
            <ref bean="freeConnectionSlotsMetric"></ref>  <!-- dynamically injected metrics -->
            <ref bean="availableMemoryMetric"></ref>  <!-- dynamically injected metrics -->
        </list>
    </property>


    <property name="relayMetricRules">
        <list value-type="com.red5pro.services.streammanager.nodes.metrics.MetricRule">
            <ref bean="restreamerCountMetric"></ref>  <!-- dynamically injected metrics -->
            <ref bean="availableMemoryMetric"></ref>  <!-- dynamically injected metrics -->
        </list>
    </property>

</bean>

Each of the above dynamic metrics is expressed as a percentage. The weight value, supplied via the configuration file during initialization, is attached to the maximum possible value (100%) for that metric. As the connection or memory load of the node increases, the overall evaluated weight for it decreases. This is then compared with the evaluated weight of other candidate nodes in the category to select the best node.

Each metric rule property listed under the serverMetricsEvaluator Java bean groups the individual dynamic metrics required for evaluating the node score for a specific node type.

  • To exclude a dynamic metric from origin score calculation, omit or comment-out the metric from the 'originMetricRules' list property in the serverMetricsEvaluator bean.
  • To exclude a dynamic metric from edge score calculation, omit or comment-out the metric from the 'edgeMetricRules' list property in the serverMetricsEvaluator bean.
  • To exclude a dynamic metric from relay score calculation, omit or comment-out the metric from the 'relayMetricRules' list property in the serverMetricsEvaluator bean.

Generally you do not need to edit anything in the applicationContext file. You can control most of the required settings from the red5-web.properties file.

Refer to METRIC WEIGHTS FOR BEST NODE EVALUATION SECTION discussed in the Stream Manager Configurable Properties topic.

LOAD BALANCED STREAMMANAGER

The Stream Manager application supports load balancing (at the moment for AWS only). This means that in anticipation of heavy traffic, you can set up more than one Stream Manager behind the cloud platform's load balancer service. This ensures that traffic requests (broadcast / subscribe) are evenly distributed between multiple Stream Manager instances, preventing a flood of requests on a single instance.

Multiple Stream Managers still interact with a single shared database. So if your traffic needs are high, make sure to set up a higher-configuration database instance to host the RDS.

Since time synchronization between multiple Stream Managers is based on UTC, it is important that the system clock of each Stream Manager instance be accurate and managed by a reliable NTP service. The administrator must set up an NTP service on the VM instance prior to setting up Stream Manager.


Locating The Stream Manager Application

The Stream Manager application comes packaged with your Red5 Pro distribution. You can locate streammanager in the webapps directory of your Red5 Pro installation directory: {red5prohome}/webapps/streammanager. Configurable files can be located inside the WEB-INF folder: {RED5_HOME}/webapps/streammanager/WEB-INF.

NOTE: Any changes made to Stream Manager configuration files will require restarting the Red5 Pro server service.

Stream Manager Configurable Properties

red5-web.properties Configuration File

The properties given below can be configured in the Stream Manager configuration file red5-web.properties located at: {streammanager}/WEB-INF/classes/red5-web.properties.

DATABASE CONFIGURATION SECTION

config.dbHost={host}
config.dbPort=3306
config.dbUser={username}
config.dbPass={password}
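
For example, a filled-in database configuration might look like this (hypothetical values):

config.dbHost=10.0.0.5
config.dbPort=3306
config.dbUser=streammanager
config.dbPass=<your-database-password>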

config.dbHost

Configures the database host IP Address for RDS access. This property is required for proper functioning of the Stream Manager's database operations.

config.dbPort

Configures the database port for RDS access. This property is required for proper functioning of Stream Manager's database operations. The default value for this property is 3306, since MySQL servers are typically configured to run on port 3306.

config.dbUser

Configures the database username for RDS access. This property is required for proper functioning of the Stream Manager's database operations. This must correspond to your database account access credentials.

config.dbPass

Configures the database password for RDS access. This property is required for proper functioning of Stream Manager's database operations. This must correspond to your database account access credentials.

NODE CONTROLLER CONFIGURATION SECTION

instancecontroller.newNodePingTimeThreshold=150000
instancecontroller.replaceDeadClusters=true
instancecontroller.deleteDeadGroupNodesOnCleanUp=true
instancecontroller.instanceNamePrefix=node
instancecontroller.nodeGroupStateToleranceTime=180000
instancecontroller.nodeStateToleranceTime=180000
instancecontroller.cloudCleanupInterval=180000
instancecontroller.blackListCleanUpTime=600000
instancecontroller.pathMonitorInterval=30000
instancecontroller.minimumNodeFreeMemory=50

instancecontroller.newNodePingTimeThreshold

Configures the maximum expected ping time of a newly launched Red5 Pro instance. This value is configured in milliseconds. This time takes into account the time required for instance startup and Red5 service boot up. New nodes that are unable to ping the stream manager within the expected time (newNodePingTimeThreshold) are assumed to be out of service or dead nodes.

instancecontroller.replaceDeadClusters

Configures whether to replace a dead Node Group with a new one or not. A Node Group is considered dead by the Stream Manager if its origin hasn't pinged Stream Manager for a long time (governed by instancecontroller.nodePingTimeThreshold).

  • If this property is set to true, Stream Manager replaces a dead Node Group with a new one with the same group configuration.
  • Setting this property to false ensures that dead clusters are cleaned up from the system without any replacement.

instancecontroller.deleteDeadGroupNodesOnCleanUp

Configures whether termination of nodes belonging to a dead Node Group implies permanently deleting the instances on the cloud platform or merely stopping them. A node group is considered dead by Stream Manager if its origin hasn't pinged Stream Manager for a long time (governed by instancecontroller.nodePingTimeThreshold).

  • If this property is set to true, Stream Manager ensures that instances of the dead Node Group are permanently deleted from the cloud platform.
  • Setting this property to false means that cloud instances of a dead Node Group will be stopped and not permanently deleted from the cloud platform.

instancecontroller.instanceNamePrefix

Configures the prefix that is prepended to the name of any automatically created node. The Stream Manager uses this prefix to search for and remove any nodes that have been stopped or are not communicating with the Stream Manager and need to be removed and/or replaced. If you are hosting multiple autoscaling environments within a hosting platform (a single Google Compute Engine project, for example), it is critical that this name be unique across solutions, because if a Stream Manager detects a host with the matching naming prefix that is not in its database, it will shut down that node.

instancecontroller.nodeGroupStateToleranceTime

Configures the net time (in milliseconds) to wait before declaring a node group as stranded. A stranded node group is commonly a group which is in a TERMINATING state for longer than instancecontroller.nodeGroupStateToleranceTime milliseconds.

instancecontroller.nodeStateToleranceTime

Configures the net time (in milliseconds) to wait before declaring a node as stranded. A stranded node is commonly a node which is in a state other than INSERVICE state for longer than instancecontroller.nodeStateToleranceTime milliseconds.

instancecontroller.cloudCleanupInterval

Configures the interval (in milliseconds) at which the CloudSweeper job runs. This is a process that checks for nodes running on the cloud platform that were spawned by the current Stream Manager but are not registered in the database.

instancecontroller.blackListToleranceTime

Configures the net time (in milliseconds) to tolerate a CloudSweeper-detected unwanted instance in a temporary blacklist buffer before it is terminated.

instancecontroller.blackListCleanUpTime

Configures the net time (in milliseconds) after which the CloudSweeper blacklist is cleaned up. Blacklist cleanup accounts for any instance that was detected once but never detected again. This may happen if a user manually deletes the instance on the cloud.

instancecontroller.pathMonitorInterval

Time interval (in milliseconds) for path monitor cron. The path monitor cron runs as a part of dynamic clustering to check cluster paths between nodes periodically and attempts to fix broken paths. The default value is set to 30 seconds.

instancecontroller.minimumNodeFreeMemory

Minimum free memory (in MB) allowed on a node for it to be considered usable. If the free memory drops below this value, traffic will not be forwarded to that node.

METRIC WEIGHTS FOR BEST NODE EVALUATION SECTION

instanceevaluator.streams.metricweight=30
instanceevaluator.connections.metricweight=15
instanceevaluator.subscribers.metricweight=60
instanceevaluator.memory.metricweight=20
instanceevaluator.restreamer.metricweight=35

instanceevaluator.streams.metricweight

Refers to the weight value attached to the maximum stream count percentage on an origin (100). The weight of the current metric value is evaluated dynamically at runtime. The stream count metric is internally evaluated as a percentage. This helps select an origin with the fewest broadcast streams.

instanceevaluator.connections.metricweight

Refers to the weight value attached to the maximum connection count percentage on an origin (100). The weight of the current metric value is evaluated dynamically at runtime. This helps select an origin with the fewest connections.

instanceevaluator.subscribers.metricweight

Refers to the weight value attached to the net subscriber percentage on all the edges combined for an origin (100). The weight of the current metric value is evaluated dynamically at runtime. This helps select an origin with the fewest subscribers on its edges.

instanceevaluator.memory.metricweight

Metric weight for the JVM-reported free memory of a node. This metric helps in selecting the best node by involving the free memory value in node-weight evaluation. A node with higher free memory is favorable.

instanceevaluator.restreamer.metricweight

Metric weight for the restreamer count reported by a relay node in the case of dynamic clustering. This metric helps in selecting the best relay node by involving the restreamerCount parameter in the node-weight evaluation. The restreamer count depends on an active stream being subscribed to at the edge that passes through the relay.

CLUSTER CONFIGURATION SECTION

cluster.password=changeme
cluster.publicPort=1935
cluster.accessPort=5080
cluster.reportingSpeed=10000
cluster.retryDuration=30
cluster.mode=auto
cluster.idleClusterPathThreshold=30000

cluster.password

The cluster password required for dynamic clustering of nodes. This property can also be found in the {RED5_HOME}/conf/cluster.xml file and the values must match. The cluster password value defaults to changeme. Stream Manager uses the password to authenticate before making cluster API calls to nodes.

cluster.publicPort

The public RTMP port used by the cluster nodes to communicate with each other internally. This property can also be found in the {RED5_HOME}/conf/cluster.xml file and the values must match.

cluster.accessPort

The public HTTP port over which the Red5 Pro node can be accessed publicly. This property can also be found in the {RED5_HOME}/conf/cluster.xml file and the values must match.

cluster.reportingSpeed

The time period at which a clustered node repeatedly dispatches a statistics report to the Stream Manager. The report contains the clustering relationship and load statistics information. The reportingSpeed parameter is internally used by Stream Manager to evaluate values for other properties. The value is expressed in milliseconds. This property can also be found in the {RED5_HOME}/conf/cluster.xml file and the values must match.

cluster.retryDuration

The time period for which a clustered child node tries to reconnect with its parent (if connectivity is lost). This property can also be found in the {RED5_HOME}/conf/cluster.xml file and the values must match.

cluster.mode

Selects the clustering mode on Stream Manager. Currently Stream Manager supports two types of clustering modes - ondemand and auto. The ondemand mode creates a connection between nodes only when there is a stream demand that requires a path to exist between them. The auto mode, on the other hand, automatically creates a cluster link between each parent-child node pair in the group. The default value is auto.

cluster.idleClusterPathThreshold

The maximum time threshold (in milliseconds) allowed for a cluster path to remain idle (without an active stream). This value is applicable for ondemand clustering only. If a cluster path remains connected for a time duration longer than the threshold without traffic, it will be disconnected automatically. The default value is 30000.


LOADBALANCING CONFIGURATION

streammanager.ip=

streammanager.ip

The IP address of the current Stream Manager instance. When deploying multiple Stream Managers with a load balancer, each instance should define its own IP here. This is an optional parameter for a single Stream Manager based deployment.

LOCATIONAWARE CONFIGURATION

location.region

Defines the current region of the Stream Manager. Use this parameter when you wish to have a Stream Manager per region and want that Stream Manager to fetch edges in the specified region automatically.

The defined region name should be within the list of regions supported by your platform.

location.geozone

Geozones are the next level in the hierarchy of location modeling in Stream Manager. Unless the cloud controller supports geozone information, the value of geozone should always be global. This tells the Stream Manager about its geozone location.

location.strict

The strict parameter ensures that the location-aware configuration is honored. If strict is set to true and no edge nodes are found in the specified region, the request will fail.

CLOUD CONTROLLER CONFIGURATION SECTION

GOOGLE COMPUTE CONFIGURATION SECTION

compute.project={project-id}
compute.defaultzone={zone-id}
compute.defaultdisk=pd-standard
compute.network=default
compute.operationTimeoutMilliseconds=200000

compute.project

Configures the Google Compute project id under your google cloud platform account. Your Google Compute resources are managed within the project scope. To know more about Google Cloud projects you can check out the official documentation online at: https://cloud.google.com/compute/docs/projects.

compute.defaultzone

Configures the default zone of your Google Cloud project. Every project on Google Cloud Platform is associated with a default zone. For more information on the default zone you can check out the official Google page: https://cloud.google.com/compute/docs/projects#default_region_and_zone.

compute.defaultdisk

Configures the default diskType to use for the google compute instances. The value for this property must always remain as pd-standard.

compute.network

Configures the network to use for the Google Compute instances. The default value is default, since every GCP account is preconfigured with the default network.

compute.operationTimeoutMilliseconds

Configures the default timeout in milliseconds for cloud instance operations. A new cloud instance startup or termination failing to complete within the given period is considered as a failure.

SIMULATED-CLOUD CONTROLLER CONFIGURATION SECTION

managed.regionNames={custom-region}
managed.availabilityZoneNames={custom-region-zone}
managed.operationTimeoutMilliseconds=20000
managed.recycleDeadNodes=true

managed.regionNames

This attribute takes a comma-separated list of region names. For managed instances we create our own region names, which represent where the servers are located. The region name should be in a format similar to us-test1. You do not need to add more than one region.

managed.availabilityZoneNames

This attribute takes a comma-separated list of zone names. For managed instances we create our own zone names, which represent where the servers are located. The zone names should be in a format similar to us-test1-a. You do not need to add more than one zone.
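
Putting the two properties together, a minimal single-region setup might look like this (using the example formats above):

managed.regionNames=us-test1
managed.availabilityZoneNames=us-test1-a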

managed.operationTimeoutMilliseconds

This attribute is common to all cloud platform controllers. It controls the maximum time allowed for a cloud operation. However, in the context of the SimulatedCloud, this attribute is instead used to set the simulated responseDelay in milliseconds. The responseDelay makes the SimulatedCloud simulate the latency of a real cloud operation.

managed.recycleDeadNodes

This attribute configures how the simulated cloud handles dead nodes. Normally when nodes are scaled down they are recycled automatically (i.e., they are soft-reset & put back in the availability list for reuse). A node that is being deleted because of failure is generally not a healthy node, and hence is not a candidate for reuse by default. Setting this attribute to true directs the controller to force reusability of the node: the failed node being deleted will be reused; if set to false, the node will not be reused.

AWS CLOUD CONTROLLER CONFIGURATION SECTION

aws.defaultzone={default-region}
aws.operationTimeoutMilliseconds={operation-timeout}
aws.accessKey = {account-accessKey}
aws.accessSecret = {account-accessSecret}
aws.ec2KeyPairName = {keyPairName}
aws.ec2SecurityGroup={securityGroupName}
aws.defaultVPC={boolean}
aws.vpcName={vpc-name}
aws.faultZoneBlockMilliseconds=3600000
aws.forUsGovRegions=false

aws.defaultzone

A default availability zone in a preferred region. This works as a fallback launch location for your instances if automatic availability zone evaluation fails.

aws.operationTimeoutMilliseconds

Maximum time allowed for completing a cloud instance operation before the operation times out and a failure is assumed. The recommended value is 120000. The unit is milliseconds.

aws.accessKey

Your accessKey from the aws account credentials that you created earlier. [It is recommended to use IAM credentials instead of root ones.]

aws.accessSecret

Your accessSecret from the aws account credentials that you created earlier. [It is recommended to use IAM credentials instead of root ones.]

aws.ec2KeyPairName

Name of the public key you imported into your aws ec2 dashboard under “Key Pairs”. [Your key should be imported into every region that you wish to use.]

aws.ec2SecurityGroup

Name of security group you created earlier in your ec2 dashboard under Security Groups. [Your security group should be available in every region that you wish to use for launching an instance.]

aws.defaultVPC

Indicates whether the security group name mentioned in the "aws.ec2SecurityGroup" parameter is associated with a default (true) or non-default (false) VPC. The AWS platform has different requirements for launching an instance in a default VPC vs a non-default VPC. Hence it is important to indicate what type of VPC you are using.

aws.vpcName

The name of your VPC (needs to be the same name for each region that you are using for autoscaling).

aws.faultZoneBlockMilliseconds

Defines the time in milliseconds for which a specific availability zone is avoided after a launch fails because the requested instance type is not available there at runtime.

aws.forUsGovRegions

Boolean value indicating whether the controller targets US Gov regions or standard ones. Set to true if targeting US Gov regions, otherwise false.

For more information check out the US Gov Support Notes.

AZURE CLOUD CONTROLLER CONFIGURATION SECTION

az.resourceGroupName={master-resourcegroup}
az.resourceGroupRegion={master-resourcegroup-region}
az.resourceNamePrefix={resource-name-prefix}
az.clientId={azure-ad-application-id}
az.clientKey={azure-ad-application-key}
az.tenantId={azure-ad-id}
az.subscriptionId={azure-ad-subscription-id}
az.vmUsername=ubuntu
az.vmPassword={password-to-set-for-dynamic-instances}
az.defaultSubnetName=default
az.operationTimeoutMilliseconds=120000
az.quickOperationResponse=true
az.quickResponseCheckInitialDelay=20000
az.apiLogLevel=BASIC

az.resourceGroupName

The name of the master resource group used for managing autoscaling resources.

az.resourceGroupRegion

The default region of the master resource group.

az.resourceNamePrefix

Resource name prefix name used to resolve resources in the resource group.

az.clientId

The application ID which is generated after the web app is registered in Azure AD.

az.tenantId

The Azure Active Directory ID.

az.subscriptionId

The id of the active subscription which will be used for managing autoscaling resources.

az.vmUsername

The username that is generated for ssh access on a dynamic autoscaled instance. Defaults to ubuntu.

az.vmPassword

The password that is generated for ssh access on a dynamic autoscaled instance.

az.defaultSubnetName

The name of the default subnet of a Virtual Network which is used by Stream Manager to launch virtual machines. Defaults to default and should not be changed.

az.operationTimeoutMilliseconds

Maximum time allowed for completing a cloud instance operation before the operation times-out assuming a failure. Recommended value is 120000. The unit is milliseconds.

This attribute currently has no useful role in the Azure controller. However, it is retained to comply with controller properties standardization. The value need not be changed.

az.quickOperationResponse

Enables the quick response mode of the Azure controller, which helps speed up delete operations. With this flag turned on, the controller will not wait for the entire delete to complete. Rather, when it sees that the instance is in a Deleting state, it acknowledges a successful delete operation. Defaults to true.

az.quickResponseCheckInitialDelay

The initial wait time before checking for the instance state when running in quick response mode.

az.apiLogLevel

The log level of the azure SDK. Defaults to BASIC.

REST SECURITY SECTION

rest.administratorToken=

rest.administratorToken

Configures the administrator's security token for making administrative REST API calls. This protects your REST gateway from unauthorized use.

RED5PRO NODE SERVER API SECTION

serverapi.port=5080
serverapi.protocol=http
serverapi.version=v1
serverapi.accessToken=

serverapi.port

The HTTP/HTTPS port on which the Red5 Pro API on the remote node expects connections. Defaults to 5080.

serverapi.protocol

The protocol (http/https) over which the Red5 Pro API on the remote node expects connections. Defaults to http.

serverapi.version

The Red5 Pro server API version string. Defaults to v1.

serverapi.accessToken

The administrator access token for the Red5 Pro server API web application as defined in its red5-web.properties file.

DEBUGGING CONFIGURATION SECTION

debug.logaccess=false
debug.logcachexpiretime=60000

debug.logaccess

Configures Stream Manager to allow or deny access to logs over the REST API. A boolean true implies that access is allowed, whereas false implies that access is denied.

debug.logcachexpiretime

The expiry time (in milliseconds) of a temporarily generated log path on the server.

WEBSOCKET PROXY SECTION

proxy.enabled=false

proxy.enabled

Enables or disables the WebSocket proxy on Stream Manager. A boolean true implies that the proxy is enabled and false implies that it is disabled. Defaults to false.


Setting Up A New Cluster for Streaming

To start streaming operations you need to have a minimum of one active cluster (NodeGroup). As described before, a cluster needs a minimum of one origin and one edge at any given time to conduct a successful streaming event.

To create a new cluster you need to have access to Stream Manager’s REST API gateway and the Red5 Pro Stream Manager API.

The following are the steps to set up a new cluster (NodeGroup) of Red5 Pro instances.

1. Determine an approximation of your streaming traffic

Before you start creating a cluster, you need to have an approximate figure of what your estimated traffic will be like. Going a step further, you should divide your traffic requirement into two types:

  • Base Traffic: Base traffic is the minimum traffic you expect on your setup. The Nodegroup should be able to cater to the minimum traffic expected at all times, without needing to scale up. The base traffic for publishers and subscribers should be evaluated separately.
  • Peak Traffic: Peak traffic is the maximum traffic you expect on your setup. The Nodegroup should be able to cater to the maximum traffic expected using autoscaling. The peak traffic for publishers and subscribers should be evaluated separately. As soon as the traffic goes down, autoscaling will remove the added nodes automatically to go back to Base Traffic capacity.

Once you have sufficient statistics/idea about your traffic needs you are ready to start creating the nodegroup.

2. Create a launch configuration

As described earlier in this document, a launch configuration helps in launching a new instance on the target platform. Once you have an idea of the expected Base Traffic and Peak Traffic for your rig, you can create your launch configuration appropriately.

The launch configuration requires you to specify the preferred instanceType (machine types defined by your platform) and an estimated connectionCapacity for each node type that will be present in your setup.

Each instance type comes with a certain specific configuration as defined by your cloud platform. Red5 Pro supports 4 types of clients based on the protocol used for media delivery - RTMP, RTSP, WebRTC and HLS (subscription only). Each client type exerts a different amount of load on the server's CPU, RAM etc. Therefore it is very important to choose the proper instanceType (depending on your expected traffic type, stream quality etc) and configure it with an appropriate estimated connectionCapacity in the launch configuration for best results.

Selecting an incorrect instanceType or configuring an improper estimated connectionCapacity may lead to instance crash/failure due to overload, or wastage due to underutilization. It is also important to note that the value of connectionCapacity influences autoscaling since it states the maximum capacity of a node.

The estimated connectionCapacity may also vary widely with the node role, i.e. origin/edge/relay/transcoder.

It is recommended that you contact support for advice on the best instanceType to use and the appropriate estimated connectionCapacity for it based on your streaming needs.

Once you know the best instanceType to use per node type and the estimated connectionCapacity for it, you can create the launch configuration for your nodegroup. To create a launch configuration please check out Stream Manager API for launch configuration management.

3. Create a scale policy

Similar to the launch configuration, the scale policy too plays a pivotal role in guiding autoscaling to meet your traffic needs.

A scale policy requires you to specify the minLimit and the maxLimit per node type that will be used in your setup. The minLimit helps target the Base Traffic, whereas the maxLimit helps with the Peak Traffic.

Determining minLimit and maxLimit for a node type

Consider an example where you estimate that you are going to have a minimum of 900 publishers, and your instanceType used for the role of origin has an estimated connectionCapacity of 500: your minLimit for the node type origin should be set to 2. Similarly, if you estimate a maximum of 1400 publishers at most, your maxLimit for the node type origin should be set to 3.
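
The calculation is a simple ceiling division, sketched below with the figures from this example:

import math

connection_capacity = 500   # estimated connectionCapacity of the origin instanceType
base_publishers = 900       # minimum expected publishers (Base Traffic)
peak_publishers = 1400      # maximum expected publishers (Peak Traffic)

min_limit = math.ceil(base_publishers / connection_capacity)   # 2 origins at minimum
max_limit = math.ceil(peak_publishers / connection_capacity)   # 3 origins at peak
print(min_limit, max_limit)   # -> 2 3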

It is always recommended to have a higher-than-expected value for maxLimit specified in your scale policy, as it will help autoscaling deal with higher-than-anticipated Peak Traffic effortlessly.

Once you know the best minLimit and maxLimit required per node type, you can create the scale policy for your nodegroup. To create a scale policy please check out the Stream Manager API for scale policy management.

4. Create a new NodeGroup

Once you have created a launch configuration and a scale policy for your group, the next step is to create the group itself.

Use the CREATE GROUP REST API call to create a new virtual group name (placeholder) which will define a collection of origins and edges, and optionally relays and transcoders, as per your launch configuration & scale policy data. Note the group name from the REST response received.

Every group operation requires the group name as the primary parameter.

5. Launch a new origin in the Node Group

Use the LAUNCH NEW ORIGIN REST API call to start a new origin instance in the group created in step 4.

  • The instance normally takes less than 2 minutes to be completely initialized and active. This includes the time needed for the machine to start up and the time required for the Red5 Pro service to be running. (You can optimize the startup time by removing unwanted applications before creating the image).

After the origin is ready, it pings Stream Manager over HTTP/HTTPS to discover its role and acquire initial configuration data. Once the first origin has been initialized, Stream Manager checks the edge/relay/transcoder requirements using the current state of the group and the scale policy. It then sequentially adds additional nodes to the group as per the scale policy definition, using the launch configuration as a guide for launching instances.

Each node, upon a successful startup, pings Stream Manager over HTTP/HTTPS to discover its role and acquire initial configuration data. You can start using services as soon as you have a minimum of one origin and one edge in the group. Stream Manager will add more nodes to the group using the Autoscaler when it sees the traffic load increasing beyond the configured Threshold, or if a node is removed due to failure.

Transcoder nodes, although a part of the nodegroup, are not a part of the cluster relationship.


Consuming Services As a Broadcaster

Once you have an active cluster for streaming the next step is for a publisher client to start broadcasting to it.

The problem here is that the broadcaster may not know the address of the origin server that they need to broadcast the stream to. This is where Stream Manager comes into play again.

Broadcaster Stream Request

Stream Manager provides a public REST endpoint for providing stream information to broadcasters and subscribers based on the requester's role in the system. The broadcaster client will need to make a REST call to Stream Manager using a scope name, stream name and the action parameter (broadcast).

  • Scope name is the Red5 application/context name that a client connects to. The default scope name to be used is live
  • Stream name is the name of the publishing stream
  • Action is a query string parameter which defines the client's request type: broadcast or subscribe

Requesting Stream Broadcast Information

Use Stream Manager's READ STREAM REST API to request an origin server.

The combination of scope name and stream name should be unique for each streaming event.

Request Format

http(s)://{host}:{port}/streammanager/api/<api-version>/event/{scopeName}/{streamName}?action=broadcast

Response Format

{
  "name": "<stream-name>",
  "scope": "<stream-scope>",
  "serverAddress": "<origin-host-address>",
  "region": "<region-code>"
}
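
For example, a broadcast request and its response might look like the following (the host, stream and region values are hypothetical, and <api-version> depends on your deployment):

http://streammanager.example.com:5080/streammanager/api/4.0/event/live/mystream?action=broadcast

{
  "name": "mystream",
  "scope": "live",
  "serverAddress": "10.0.0.12",
  "region": "us-east-1"
}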

Connecting To Server To Broadcast

Having received a successful response from Stream Manager with the stream broadcast information, we can now publish the stream using a publisher client.

The stream publish information received from the REST response can be used by different types of publisher clients as shown below.


Android Client

Property Name    Value
host             <serverAddress from the REST response>
context          <stream scope, e.g. live>
port             8554
stream           <stream name>

iOS Client

Property Name    Value
host             <serverAddress from the REST response>
context          <stream scope, e.g. live>
port             8554
stream           <stream name>

Flash Client

Property Name    Value
Connection URL   rtmp://<serverAddress>:1935/<scope>
stream           <stream name>

WebRTC Stream Manager Proxy Publishing

WebRTC requires using the Stream Manager proxy feature to bypass the secure origin requirement in order to publish/subscribe.

Stream Manager can multiplex between its regular responsibilities as well as being a proxy for WebRTC publish/subscribe operations. Thus for WebRTC clients the application is streammanager instead of any other Red5 Pro webapp.

To learn more about the proxy and configuring a WebRTC client to work with it, please see the Stream Manager Proxy Guide.


Consuming Services As a Subscriber

If you are coming from the previous section, then you have a working Red5 Pro cluster in place with at least one stream publishing to it.

Now that the stream is publishing and available, subscriber clients may be interested in subscribing to it. However, they don't know the host address of the edge that they should subscribe to.

Once again, Stream Manager comes to the rescue by providing stream information to the subscriber client via a REST API call. The subscriber client will need to make a READ STREAM REST call to Stream Manager using a scope name, stream name and the action parameter (subscribe).

  • Scope name is the Red5 application/context name that a client connects to. The default scope name to be used is live
  • Stream name is the name of the publishing stream
  • Action is a query string parameter which defines the client's request type: broadcast or subscribe

Requesting Stream Subscribe Information

Use Stream Manager's READ STREAM REST API to request an edge server.

The combination of scope name and stream name should be unique for each streaming event.

Request Format

http(s)://{host}:{port}/streammanager/api/<api-version>/event/{scopeName}/{streamName}?action=subscribe

Response Format

{
  "name": "<stream-name>",
  "scope": "<stream-scope>",
  "serverAddress": "<edge-host-address>",
  "region": "<region-code>"
}
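
As a minimal illustration (not an official client), a subscriber could fetch the edge address using only the Python standard library; the host, API version, scope and stream name below are hypothetical placeholders:

import json
import urllib.request

# Ask Stream Manager's READ STREAM endpoint for an edge to subscribe from.
url = ("http://streammanager.example.com:5080/streammanager/api/4.0"
       "/event/live/mystream?action=subscribe")

with urllib.request.urlopen(url) as resp:
    info = json.load(resp)

edge = info["serverAddress"]   # edge host returned by Stream Manager
scope = info["scope"]          # e.g. "live"
stream = info["name"]          # the stream to play
print(f"Subscribe at rtmp://{edge}:1935/{scope} (stream: {stream})")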

Connecting To Server To Subscribe

Having received a successful response from Stream Manager with the stream subscribe information, we can now consume the stream using a subscriber client.

The stream subscribe information received from the REST response can be used by different types of subscribing clients as shown below.


Android Client

Property Name    Value
host             <serverAddress from the REST response>
context          <stream scope, e.g. live>
port             8554
stream           <stream name>

iOS Client

Property Name    Value
host             <serverAddress from the REST response>
context          <stream scope, e.g. live>
port             8554
stream           <stream name>

Flash Client

Property Name    Value
Connection URL   rtmp://<serverAddress>:1935/<scope>
stream           <stream name>

WebRTC Stream Manager Proxy Subscribing

WebRTC requires using the Stream Manager proxy feature to bypass the secure origin requirement in order to subscribe.

To learn more about the proxy and configuring a WebRTC client to work with it, please see the Stream Manager Proxy Guide.


Notes

  • You cannot make a subscribe request for a stream if it is not publishing
  • If a stream stops publishing the subscribers must re-query the Stream Manager for stream details