Support for OS X was written several months ago, by defining a new sound card type in mediastreamer (a third-party framework included in libjingle), making use of the cross-platform PortAudio API to do interaction with the audio hardware. We never tested it, so we assumed it didn’t work. Fortunately, a quick test proved that wrong ! So now we are down to one unsupported platform: Windows. Because PortAudio is cross-platform, it should in theory work on other platforms as well, including Windows.
音の再生はこんな
WindowsのVisual C# 2008 Express EditionのテンプレートからつくったWindowsフォームアプリケーション。
[csharp]santrini% cat Program.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;
namespace sound
{
static class Program
{
/// /// アプリケーションのメイン エントリ ポイントです。
///
[STAThread]
static void Main()
{
Application.EnableVisualStyles();
Application.SetCompatibleTextRenderingDefault(false);
Application.Run(new Form1());
}
}
}
santrini% cat Form1.cs
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Text;
using System.Windows.Forms;
namespace sound
{
public partial class Form1 : Form
{
public Form1()
{
WavPlayer player = new WavPlayer();
//player.play(“/home/mo/tmp/csharp/sound/1.wav”);
player.play(“/home/mo/tmp/csharp/sound/1.mp3″);
}
}
}
santrini% cat WavPlayer.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Media;
namespace sound
{
class WavPlayer
{
public WavPlayer()
{
}
public void play(string wavfile)
{
SoundPlayer player = new SoundPlayer(wavfile);
player.Play();
}
}
}
santrini% gmcs Program.cs Form1.cs WavPlayer.cs -r:System.Windows.Forms.dll
santrini% export DISPLAY=localhost:51.0; mono Program.exe
[/csharp]
これはありだなぁ
libjingle was created at about the same time as the Jingle XMPP extension (XEP-0166). The libjingle team created their own protocol to handle session negotiation, and later worked with the XMPP Standards Foundation to standardize Jingle; thus, although the libjingle protocol and Jingle are very similar, they are not the same, and are not interoperable.
libjingleは前回までみてきたjingle specとは非互換だしjingleの方は実装してない、さらにlibjingleの独自実装は(たぶん探しても見つからないので)仕様非公開。
いちいち難所がありますな><
very similarに期待して進みます。
This specification defines a Jingle transport method that results in sending media data using raw datagram sockets via the User Datagram Protocol (UDP). This transport method is negotiated via the Interactive Connectivity Establishment (ICE) methodology defined by the IETF and thus provides robust NAT traversal for media traffic.
This specification defines a Jingle transport method that results in sending media data using raw datagram sockets via the User Datagram Protocol (UDP). This simple transport method does not provide NAT traversal, and the ICE-UDP transport method should be used if NAT traversal is required.
NAT越えが必要なら前者。後で後者も読む。
The current document defines a transport method for establishing and managing data exchanges between XMPP entities over the User Datagram Protocol (see RFC 768 [2]), using the ICE methodology developed within the IETF and specified in Interactive Connectivity Establishment (ICE) [3] (hereafter referred to as ICE-CORE). Use of the ice-udp method results in a lossy transport suitable for media applications where some packet loss is tolerable (e.g., audio and video).
ICEとの違い
・signaling channelにSIPではなく、XMPPを使用する
・ICEでは、全てのtransport候補をいっぺんに送るところ、個別に送ってもいい。XMPPのによるリクエスト/レスポンス機構を利用することで優先度の高いtransportを早く伝えられるので、negotiationが速い。
・SDPの文法はXMPPのxmlにマッピングされている(前回のpayload typeのところかな)
・ICE candidates can be upgraded during a session (e.g., to change an IP address)
IPアドレスの変更に耐えられるってこと?mobilityのサポートってこと?なんだろこれ
・どちらのxmpp entityでも、セッション中にいつでもnegotiationをやり直せる
Abstract
This document describes a protocol for Network Address Translator
(NAT) traversal for UDP-based multimedia sessions established with
the offer/answer model. This protocol is called Interactive
Connectivity Establishment (ICE). ICE makes use of the Session
Traversal Utilities for NAT (STUN) protocol and its extension,
Traversal Using Relay NAT (TURN). ICE can be used by any protocol
utilizing the offer/answer model, such as the Session Initiation
Protocol (SIP).
2.5. Security for Checks
Each STUN
connectivity check is covered by a message authentication code (MAC)
computed using a key exchanged in the signalling channel. This MAC
provides message integrity and data origin authentication, thus
stopping an attacker from forging or modifying connectivity check
messages.
More fundamentally, however, the prioritization
defined by this specification may not yield “optimal” results. As an
example, if the aim is to select low latency media paths, usage of a
relay is a hint that latencies may be higher, but it is nothing more
than a hint. An actual RTT measurement could be made, and it might
demonstrate that a pair with lower priority is actually better than
one with higher priority.
Naturally, more complex scenarios are probable; such scenarios are described in other specifications, such as XEP-0167 for voice chat.
ってのがあるのでポインタの先に飛んでみる
XEP-0167: Jingle RTP Sessions
This specification defines a Jingle application type for negotiating one or more sessions that use the Real-time Transport Protocol (RTP) to exchange media such as voice or video. The application type includes a straightforward mapping to Session Description Protocol (SDP) for interworking with SIP media endpoints.
音声や映像などを交換するためにRTPを使用する、1つ以上のセッションのnegotiationをするためのapplication typeについて定義する。
application typeはSIPと互換性をもてるよう、SDPと1対1対応するマッピングを含む。
RTP
provides end-to-end network transport functions suitable for
applications transmitting real-time data, such as audio, video or
simulation data, over multicast or unicast network services. RTP
does not address resource reservation and does not guarantee
quality-of-service for real-time services. The data transport is
augmented by a control protocol (RTCP) to allow monitoring of the
data delivery in a manner scalable to large multicast networks, and
to provide minimal control and identification functionality. RTP and
RTCP are designed to be independent of the underlying transport and
network layers. The protocol supports the use of RTP-level
translators and mixers.
provides end-to-end delivery services for data with real-time
characteristics, such as interactive audio and video. Those services
include payload type identification, sequence numbering, timestamping
and delivery monitoring.
payloadの定義と、シーケンス番号付けと、時間情報の付加、モニタリング機能。
おいおいRTPのRFC104ページもあって無理。適当に抜粋しよう
While RTP is primarily designed to satisfy the needs of multi-
participant multimedia conferences, it is not limited to that
RTP packet: A data packet consisting of the fixed RTP header, a
possibly empty list of contributing sources (see below), and the
payload data. Some underlying protocols may require an
encapsulation of the RTP packet to be defined. Typically one
packet of the underlying protocol contains a single RTP packet,
but several RTP packets MAY be contained if permitted by the
encapsulation method (see Section 11).
Synchronization source (SSRC): The source of a stream of RTP
packets, identified by a 32-bit numeric SSRC identifier carried in
the RTP header so as not to be dependent upon the network address.
All packets from a synchronization source form part of the same
timing and sequence number space, so a receiver groups packets by
synchronization source for playback. Examples of synchronization
sources include the sender of a stream of packets derived from a
signal source such as a microphone or a camera, or an RTP mixer
(see below). A synchronization source may change its data format,
e.g., audio encoding, over time. The SSRC identifier is a
randomly chosen value meant to be globally unique within a
particular RTP session (see Section 8). A participant need not
use the same SSRC identifier for all the RTP sessions in a
multimedia session; the binding of the SSRC identifiers is
provided through RTCP (see Section 6.5.1). If a participant
generates multiple streams in one RTP session, for example from
separate video cameras, each MUST be identified as a different
SSRC.
Contributing source (CSRC): A source of a stream of RTP packets
that has contributed to the combined stream produced by an RTP
mixer (see below). The mixer inserts a list of the SSRC
identifiers of the sources that contributed to the generation of a
particular packet into the RTP header of that packet. This list
is called the CSRC list. An example application is audio
conferencing where a mixer indicates all the talkers whose speech
was combined to produce the outgoing packet, allowing the receiver
to indicate the current talker, even though all the audio packets
contain the same SSRC identifier (that of the mixer).
This document describes a profile called “RTP/AVP” for the use of the
real-time transport protocol (RTP), version 2, and the associated
control protocol, RTCP, within audio and video multiparticipant
conferences with minimal control. It provides interpretations of
generic fields within the RTP specification suitable for audio and
video conferences. In particular, this document defines a set of
default mappings from payload type numbers to encodings.
This document also describes how audio and video data may be carried
within RTP. It defines a set of standard encodings and their names
when used within RTP. The descriptions provide pointers to reference
implementations and the detailed standards. This document is meant
as an aid for implementors of audio, video and other real-time
multimedia applications.
RTPバージョン2とRTCPを使って、多人数参加の音声、映像会議を最低限の管理で行うためのprofile:”RTP/AVP”について記述する。
さらに、音声と映像データがどうRTPの中で転送されるか、説明する。
thanks for the aid!!
RTPパケットの、PTに当たるところの表。
1234567891011121314151617
PT encoding media type clock rate channels
name (Hz)
\___\___\___\___\___\___\___\___\___\___\___\___\___\___\___\___\___
0 PCMU A 8,000 1
1 reserved A
2 reserved A
3 GSM A 8,000 1
4 G723 A 8,000 1
5 DVI4 A 8,000 1
6 DVI4 A 16,000 1
7 LPC A 8,000 1
8 PCMA A 8,000 1
9 G722 A 8,000 1
10 L16 A 44,100 2
11 L16 A 44,100 1
12 QCELP A 8,000 1
続く。。。
This document defines the Session Description Protocol, SDP. SDP is
intended for describing multimedia sessions for the purposes of
session announcement, session invitation, and other forms of
multimedia session initiation.
SDP includes:
o Session name and purpose
o Time(s) the session is active
o The media comprising the session
o Information to receive those media (addresses, ports, formats and
so on)
As resources necessary to participate in a session may be limited,
some additional information may also be desirable:
o Information about the bandwidth to be used by the conference
o Contact information for the person responsible for the session
Session description
v= (protocol version)
o= (owner/creator and session identifier).
s= (session name)
i= (session information)
u= (URI of description)
e= (email address)
p= (phone number)
c= (connection information – not required if included in all media)
b= (bandwidth information)
One or more time descriptions (see below)
z= (time zone adjustments)
k= (encryption key)
a=* (zero or more session attribute lines)
Zero or more media descriptions (see below)
Time description
t= (time the session is active)
r=* (zero or more repeat times)
Media description
m= (media name and transport address)
i= (media title)
c= (connection information – optional if included at session-level)
b= (bandwidth information)
k= (encryption key)
a=* (zero or more media attribute lines)
XEP-0166: Jingle
XMPP protocol extension for initiating and managing peer-to-peer media sessions between two XMPP entities in a way that is interoperable with existing Internet standards
既存のInternet標準と共存できる方法で、2つのXMPPentities間でP2Pメディアセッションを確立、管理するためのXMPPプロトコルの拡張。
In essence, Jingle enables two XMPP entities (e.g., romeo@montague.lit and juliet@capulet.lit) to set up, manage, and tear down a multimedia session. The negotiation takes place over XMPP, and the media transfer takes place outside of XMPP. The simplest session flow is as follows:
セッションのnegotiationはXMPPで、メディアはXMPP外で。
**After successful transport negotiation (not shown here),** the responder accepts the session by sending a session-accept action to the initiator.
Example 3. Responder definitively accepts the session