
The Remote Integration Model (REMI), sometimes called “at-home production” in the English-language literature, refers to a broadcast workflow in which video and audio capture takes place on site — sports venue, conference room, field — while the entire production (switching, audio mixing, graphics, direction) is carried out remotely, in a centralized production center or in the cloud.
This model is the radical opposite of the traditional OB (Outside Broadcast) model, in which a complete production truck — vision mixer, graphics, sound, monitoring, technical crew — is transported and deployed on site for each event. REMI reverses the logic: we transport the pixels, not the people or the equipment.
Three convergences have made REMI viable at scale: the commoditization of high-capacity IP networks (fiber, 5G), the standardization of the SRT protocol for secure contribution over the public Internet, and the availability of high-performance software switchers running on standard cloud infrastructure. Before these three elements combined, REMI was reserved for a few major players with dedicated fiber links.
Understanding REMI means understanding the signal path through seven distinct layers. Each layer introduces latency, dependency, and quality constraints that must be controlled.
2.1 — Layer 1: capture
It all starts with the cameras. In a professional REMI environment, video sources are standard broadcast cameras (Sony, Ikegami, Grass Valley LDX) that deliver an SDI (Serial Digital Interface) signal in 3G-SDI (1080i/1080p) or 12G-SDI (4K). This uncompressed SDI signal represents around 1.5 Gbps for 1080i — a bitrate totally incompatible with IP transport over the public Internet.
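As a sanity check on the ~1.5 Gbps figure, the nominal HD-SDI rate can be derived from the raster's line structure (total samples and lines including blanking, 4:2:2 10-bit sampling). The constants below are the standard SMPTE 274M values for 1080i50:

```python
# Uncompressed HD-SDI (SMPTE 292M) bitrate for 1080i50, 4:2:2 10-bit.
# Totals include horizontal and vertical blanking, per SMPTE 274M.
samples_per_line = 2640   # total luma sample periods per line (1080i50)
lines_per_frame = 1125    # total lines, including vertical blanking
frames_per_sec = 25       # 1080i50 = 50 fields = 25 full frames per second
bits_per_sample = 20      # 10-bit luma + 10-bit multiplexed chroma (4:2:2)

bitrate_bps = samples_per_line * lines_per_frame * frames_per_sec * bits_per_sample
print(f"{bitrate_bps / 1e9:.3f} Gbps")  # 1.485 Gbps — the nominal HD-SDI rate
```

The same arithmetic with 3G-SDI line timings yields 2.970 Gbps for 1080p50, which is why the text rounds both to "around 1.5 Gbps" per 1.5G interface.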
At this stage, field encoders come into play: the first critical link in the REMI chain.
2.2 — Layer 2: field encoding
The hardware encoder is the most strategic field component of REMI. Its role is twofold: to compress the SDI signal into a transportable IP stream, and to encapsulate it in a transport protocol suited to the available network.
Hardware vs software encoders
A clear distinction should be made between two categories: hardware encoders (dedicated FPGA/ASIC appliances with deterministic, constant encoding latency) and software encoders (applications running on general-purpose machines, more flexible but dependent on CPU load).
Critical point
For any mission-critical broadcast production — television news, live sports, institutional events — the hardware encoder remains the de facto standard. The difference is not visible in the stream itself; it shows during network incidents or spikes in CPU load: the hardware encoder keeps going, the software encoder stutters.
Codecs and profiles
H.264 (AVC) remains the most common codec for REMI contribution: excellent hardware support, low encoding latency (Baseline profile or low-latency tuning), and bitrates of 5 to 50 Mbps depending on the target quality. In 1080p50, broadcast-grade 4:2:0 H.264 encoding sits around 15-25 Mbps; in 4:2:2 10-bit, count on 30-50 Mbps.
HEVC (H.265) offers better compression at equal quality (typically around 40% lower bitrate than H.264), making it attractive for 5G links or bandwidth-constrained connections. On the other hand, its encoding latency is slightly higher and its hardware processing requirements are heavier.
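The guideline figures above can be combined in a small sketch: the H.264 contribution ranges quoted in the text, plus an HEVC estimate assuming the roughly 40% saving at equal quality. The table and function are illustrative, not an encoder API:

```python
# Contribution-bitrate guidelines from the text (Mbps ranges for H.264),
# plus a derived HEVC estimate. Illustrative figures, not product specs.
H264_MBPS = {
    ("1080p50", "4:2:0"): (15, 25),
    ("1080p50", "4:2:2 10-bit"): (30, 50),
}

def hevc_estimate(h264_range, saving=0.40):
    """Estimate the HEVC bitrate range for the same visual quality,
    assuming the ~40% saving quoted above."""
    lo, hi = h264_range
    return (lo * (1 - saving), hi * (1 - saving))

print(hevc_estimate(H264_MBPS[("1080p50", "4:2:2 10-bit")]))  # (18.0, 30.0)
```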
2.3 — Layer 3: IP transport
This is where protocols come in, and it's often the least understood part of the REMI architecture. Three protocols coexist in the current ecosystem, with very distinct uses.
SRT — Secure Reliable Transport
Developed by Haivision and open-sourced in 2017, SRT has become the reference protocol for REMI contribution over the public Internet. Its fundamental principle: build on UDP transport (derived from the UDT protocol) while adding AES-256 encryption and selective retransmission of lost packets (ARQ, Automatic Repeat reQuest).
SRT operates in caller/listener mode or in rendezvous mode. Latency is configurable via the “latency” parameter, which defines the buffer window within which retransmission can take place: the higher the SRT latency (e.g. 2,000 ms), the more unstable the network can be without packet loss; the lower it is (e.g. 120 ms), the more stable the network link must be.
Empirical rule
The SRT latency should be set to at least 2.5 times the RTT (Round Trip Time) of the network link. On a link with an RTT of 80 ms (Paris—Lyon for example), an SRT latency of 200—300 ms is a reasonable minimum for artifact-free production. On intercontinental links (RTT > 150 ms), provide 500—800 ms of SRT buffer.
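The rule above can be captured in a small helper. The 2.5 factor comes from the text, the 120 ms floor is SRT's common default minimum, and the function name is our own:

```python
# The "SRT latency >= 2.5 x RTT" rule of thumb as a helper function.
# 120 ms is SRT's common default minimum latency setting.
def recommended_srt_latency_ms(rtt_ms: float, factor: float = 2.5,
                               floor_ms: int = 120) -> int:
    """Minimum recommended SRT 'latency' setting for a link with this RTT."""
    return max(floor_ms, round(rtt_ms * factor))

print(recommended_srt_latency_ms(80))   # 200 -- the Paris-Lyon example
print(recommended_srt_latency_ms(150))  # 375 -- intercontinental: pad further
print(recommended_srt_latency_ms(10))   # 120 -- floor kicks in on a LAN-like link
```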
NDI — Network Device Interface
Developed by NewTek (now part of Vizrt Group), NDI is a protocol designed for high-quality video distribution over a local network. NDI runs over standard TCP/IP and offers excellent image quality (visually lossless in full-bandwidth NDI, more heavily compressed in NDI HX3), but it is not suited to transport over the public Internet: it has no error-correction mechanism designed for WAN conditions and its bandwidth can reach several hundred Mbps at maximum quality.
NDI finds its place in REMI for internal distribution at the production center: streams received in SRT are decoded and redistributed in NDI to production stations, Vizrt graphics stations, and software multiviewers. It is a studio protocol, not a field contribution protocol.
SMPTE ST 2110
ST 2110 is the professional broadcast standard for uncompressed video/audio distribution over IP. It carries video (ST 2110-20), audio (ST 2110-30/31 AES67), and metadata over dedicated 10 or 25 GbE networks with PTP precision (IEEE 1588). This is the standard for large broadcast agencies that have migrated from SDI infrastructure to IP.
In a REMI context, ST 2110 is not used for field transport (too demanding in terms of bandwidth and network infrastructure). On the other hand, once the SRT streams are received at the production center, they are often decoded and redistributed in ST 2110 to feed a hardware broadcast switcher — Grass Valley K-Frame, Sony XVS — which operates natively on this standard.
RTMP — Real-Time Messaging Protocol
Originally developed by Macromedia for Flash streaming, RTMP (Real-Time Messaging Protocol) is now omnipresent in distribution workflows to broadcast platforms — YouTube Live, Twitch, Facebook Live, LinkedIn Live — as well as to OTT broadcast encoders. It relies on TCP, which guarantees the orderly delivery of packets but at the cost of a latency that is structurally higher than SRT: from 1 to 5 seconds under normal conditions, without a native adaptive error correction mechanism.
In a REMI architecture, RTMP plays little or no role in field contribution — it is too fragile on unstable networks and does not support native AES encryption. Its role is mainly as an output protocol, at the end of the chain: once the program is produced at the production center, it is encoded one last time (often in H.264, Baseline or Main profile) and pushed via RTMP to one or more simultaneous distribution destinations. This outgoing RTMP stream is distinct from the incoming SRT contribution stream — confusing the two is a common architectural error among less experienced teams.
2.4 — Layer 4: the gateway and the ingest cloud
SRT field streams arrive at a receiving system called a gateway or media server. In cloud architectures, this component is often an instance of Haivision StreamHub, AWS Elemental MediaConnect, or BenaNative running on cloud infrastructure (AWS, Azure, GCP, or Akamai).
The gateway has several responsibilities: decoding the incoming SRT stream, verifying the integrity of the signal (detecting sync loss, monitoring audio levels), and redistributing the signal to production components — the cloud switcher, the graphics servers, the recording systems.
This is also where the conversion between protocols takes place: an SRT stream can be converted into RTMP to feed a streaming platform, into HLS for OTT broadcasting, or into NDI for a software control room.
2.5 — Layer 5: switching and production
The switcher is the heart of REMI production. Two main families coexist depending on the level of requirement.
Hardware switchers with remote access
Manufacturers like Grass Valley (AMPP with K-Frame), Ross Video (Ultrix) or Sony have developed architectures where the switching engine remains dedicated hardware (or an optimized server), but the control surface can be remoted over IP. The operator works from home or from another site on a physical surface connected over the network to the remote switching frame. This approach preserves all the qualities of the hardware switcher: sub-frame switching latency, broadcast-grade keyers, native SDI signal handling.
Cloud Native Switchers
Solutions like TVU Producer, Grabyo or BenaNative offer a switching engine entirely in the cloud, operated via a web browser (and via tablet and iPhone for BenaNative). These solutions are lighter to deploy but introduce additional preview latency.
2.6 — Layer 6: the return feed (IFB)
The IFB (Interruptible Foldback) is the return audio signal sent to the earpiece of the presenter or journalist in the field. It is often the overlooked link in REMI — and yet one of the most critical for talent in the field.
In a REMI workflow, the IFB travels in the opposite direction to the program signal: the sound mixer at the production center sends a mix-minus back to the field as an IP audio stream. The total IFB latency is the sum of the forward latency (field → production) and the return latency (production → field).
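A minimal illustration of that sum, with per-leg figures that are assumed for the example rather than measured:

```python
# Total IFB delay = forward leg + return leg. The per-leg values below are
# assumed, illustrative figures, not measurements of any specific setup.
forward_ms = 350   # field -> production (encode + SRT buffer + decode)
return_ms = 250    # production -> field (mix-minus back over IP audio)

ifb_total_ms = forward_ms + return_ms
print(f"IFB mouth-to-ear delay: {ifb_total_ms} ms")  # 600 ms
```

This is the number the talent actually experiences in the earpiece, which is why both legs must be budgeted together rather than separately.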
2.7 — Layer 7: Synchronization and Genlock
Multi-source synchronization is REMI's most underrated technical challenge. In a classic SDI control room, all sources are synchronized via an SPG (Sync Pulse Generator) that distributes a Black Burst or Tri-Level Sync reference signal to all equipment. Hardware encoders accept this genlock signal, ensuring that all cameras are phase-aligned: a cut between two sources is a cut between two perfectly synchronous signals, without switching artifacts.
In a REMI workflow, this synchronization is considerably complicated. Each SRT stream arrives at the production center with variable, non-deterministic latency. The gateway must therefore perform reclocking: it reconstructs a reference timing and resynchronizes all incoming streams before presenting them to the switcher. Depending on the quality of the network link, this operation may introduce timing variations (residual jitter) that result in micro-artifacts during switching.
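A toy sketch of the reclocking idea: hold incoming packets in a buffer sized to the expected jitter, then release them in timestamp order. Real gateways do this per stream against a shared reference clock and at a fixed output cadence; this simplified version only shows the reordering:

```python
# Minimal jitter-buffer sketch: packets arrive with variable network delay
# (possibly out of order) and are released in capture-timestamp order once
# they are older than the buffer window. Toy model, not a gateway API.
import heapq

def reclock(packets, buffer_ms):
    """packets: iterable of (capture_ts_ms, payload) in arrival order.
    Returns payloads in capture order, simulating a buffer of buffer_ms."""
    heap, out = [], []
    for ts, payload in packets:
        heapq.heappush(heap, (ts, payload))
        # Release anything older than the window relative to the newest
        # timestamp seen so far.
        while heap and ts - heap[0][0] >= buffer_ms:
            out.append(heapq.heappop(heap)[1])
    while heap:  # flush the remainder at end of stream
        out.append(heapq.heappop(heap)[1])
    return out

arrived = [(0, "A"), (40, "C"), (20, "B"), (60, "D")]  # B arrived late
print(reclock(arrived, buffer_ms=100))  # ['A', 'B', 'C', 'D']
```

If a packet arrives later than the buffer window allows, a real gateway must conceal the loss (freeze, repeat, or interpolate), which is exactly the trade-off the SRT latency setting governs upstream.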
PTP — IEEE 1588
The most demanding REMI architectures use the Precision Time Protocol (PTP, IEEE 1588) to distribute a common time reference between the field and the production center over the IP network. PTP-compatible encoders (Matrox Monarch EDGE, certain Haivision Makito models) can thus maintain sub-millisecond synchronization even across long-distance networks — a major advance for sports production.
The concept of latency budget is central to the design of a REMI workflow. Each component of the chain contributes to the total delay between action in the field and switching to control. This cumulative delay is what allows — or prevents — certain types of production.
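The budget can be illustrated by summing each layer's contribution stage by stage. The per-stage values below are assumed, typical orders of magnitude for illustration, not measurements of any specific product:

```python
# Illustrative glass-to-switcher latency budget. Each value is an assumed,
# typical order of magnitude for its stage, not a measured figure.
BUDGET_MS = {
    "camera + SDI capture": 20,
    "field H.264 encode (low-latency)": 80,
    "SRT transport buffer": 250,
    "gateway decode + reclock": 60,
    "cloud switcher processing": 40,
    "program encode + hand-off": 100,
}

total = sum(BUDGET_MS.values())
for stage, ms in BUDGET_MS.items():
    print(f"  {stage:<35} {ms:>4} ms")
print(f"glass-to-switcher budget: {total} ms")  # 550 ms
```

Documenting the budget this way makes it obvious where the margin is: here the SRT buffer dominates, so shaving encoder latency buys far less than improving the network link.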

These cumulative figures have concrete operational implications for the types of production REMI can support.

5.1 — Network redundancy
A professional REMI workflow cannot rely on a single network link. The standard practice is bonding, or link redundancy: combining several physical connections (fiber, 4G, 5G, satellite) to guarantee signal continuity even if one link is lost. Solutions like Haivision, LiveU, or TVU Pack implement this bonding with real-time network-quality management algorithms.
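The bonding principle can be sketched as a capacity-weighted split of the contribution bitrate across links. Real systems also handle per-link FEC, packet reordering, and continuous remeasurement; the link names and capacities below are assumptions:

```python
# Toy capacity-weighted bonding: assign each link a share of the target
# bitrate proportional to its usable capacity. Link names and Mbps values
# are assumed for illustration.
def split_bitrate(target_mbps, links):
    """links: dict of link name -> usable capacity in Mbps.
    Returns the share of the target bitrate assigned to each link."""
    cap = sum(links.values())
    if cap < target_mbps:
        raise ValueError("aggregate capacity below target bitrate")
    return {name: target_mbps * c / cap for name, c in links.items()}

links = {"fiber": 40, "5g_a": 25, "5g_b": 15}  # assumed capacities, Mbps
print(split_bitrate(20, links))  # {'fiber': 10.0, '5g_a': 6.25, '5g_b': 3.75}
```

The point of the redundancy is visible in the arithmetic: even if the fiber link drops entirely, the two cellular links still offer 40 Mbps of aggregate capacity, enough to carry the 20 Mbps contribution.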
5.2 — Multi-track audio
Audio management in REMI is often more complex than video management. In SDI, the audio tracks are embedded in the video signal (SDI embedded audio, up to 16 channels in 3G-SDI). In IP, audio can be transported embedded in the H.264/HEVC stream, or separately over AES67 (ST 2110-30) for the most demanding architectures. Lip sync is a constant concern: each decode/re-encode step can introduce an audio/video offset that must be monitored and corrected.
5.3 — The monitoring signal
In a physical control room, monitoring is omnipresent and immediate: monitor walls, waveform monitors, vectorscopes. In REMI, remote monitoring suffers the same latency as the production signal. It is imperative to set up monitoring as close as possible to the ingest — ideally at the gateway, before any switching — to detect signal problems before they go to air.
5.4 — The tally system
Tally is the signal that tells camera operators which camera is on air (red light = live). In REMI, this signal must traverse the chain in the opposite direction to the video signal, with the constraint that its latency must be consistent with the production latency, so that the operator sees the light come on at the same moment the camera is actually cut to air.
5.5 — Security and Encryption
Transporting live content that has not yet been broadcast over the public Internet is a real security issue. SRT with AES-256 encryption is the minimum requirement. For very sensitive productions (previews, institutional events), additional measures, such as private network links or VPN tunnels layered on top of SRT, are commonly added.
REMI is not a technology — it is an architecture. Its performance is determined not by its most sophisticated component but by its weakest link. A latest-generation hardware encoder achieves nothing if the SRT link is under-provisioned. A powerful cloud switcher will not fix an upstream genlock defect.
The right approach is to reason from end to end, by documenting the latency budget of each component, by testing multi-source synchronization under representative network conditions, and by sizing redundancy according to the level of criticality of production.
The next article in this series addresses the economic question: how to calculate the TCO of a REMI workflow, and at what volume threshold it becomes financially justifiable compared to a traditional on-site control room.