Sonde Health API Platform Documentation


 The following are the specifications for the audio capture format.


Property        Detail                                                  Expected by SP
'RIFF'          RIFF file identification                                'RIFF'
'WAVE'          File type header; for our purposes, always 'WAVE'       'WAVE'
fmt marker      Format sub-chunk identification (format chunk marker)   'fmt ' (with trailing space)
fmt size        Length of format data (sub-chunk size)                  16
Audio format    Format specifier, 2-byte integer (1 is PCM)             1
channels        Number of channels                                      1
sample rate     Sample rate in kHz                                      16 / 44.1 / 48
bit depth       Bits per sample                                         16
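As a rough illustration, the "Expected by SP" values can be checked before upload with Python's standard `wave` module. This is a minimal sketch, not part of the Sonde platform: the helper name `check_capture_format` is hypothetical, and the `wave` module does not expose the fmt size, so that field is not checked here.

```python
import wave

# Sample rates listed in the "Expected by SP" column, in Hz.
ALLOWED_SAMPLE_RATES_HZ = (16000, 44100, 48000)


def check_capture_format(source):
    """Return a list of problems; an empty list means the file conforms.

    `source` is a file path or a binary file object.
    Hypothetical helper, named here for illustration only.
    """
    try:
        # Raises wave.Error if the RIFF/WAVE markers are missing
        # or the format tag is not PCM.
        wav = wave.open(source, "rb")
    except (wave.Error, EOFError) as exc:
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    with wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected 1 channel, found {wav.getnchannels()}")
        if wav.getsampwidth() != 2:  # 2 bytes = 16 bits per sample
            problems.append(f"expected 16-bit samples, found {wav.getsampwidth() * 8}-bit")
        if wav.getframerate() not in ALLOWED_SAMPLE_RATES_HZ:
            problems.append(f"unsupported sample rate: {wav.getframerate()} Hz")
    return problems
```

Calling `check_capture_format("recording.wav")` (the file name is an example) returns an empty list for a mono, 16-bit PCM file at one of the three listed sample rates.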

Summary 16000:

RIFF (little-endian) with 'fmt ' marker and fmt size = 16, WAVE audio, Microsoft PCM, 16 bit, mono, 16000 Hz

Summary 44100:

RIFF (little-endian) with 'fmt ' marker and fmt size = 16, WAVE audio, Microsoft PCM, 16 bit, mono, 44100 Hz

Summary 48000:

RIFF (little-endian) with 'fmt ' marker and fmt size = 16, WAVE audio, Microsoft PCM, 16 bit, mono, 48000 Hz

Key fields of the 44-byte header (the sample rate varies with the sampling rate of the recording):

  1. File marker (chunk ID) = 'RIFF'

  2. File type header = 'WAVE'

  3. Format chunk marker (sub-chunk 1 ID) = 'fmt ' (with trailing space)

  4. Length of format data (sub-chunk 1 size) = 16

  5. Type of format (audio format) = 1 (PCM)

  6. Number of channels = 1 (mono)

  7. Sample rate = 16000, 44100, or 48000 (44100 in this example)

  8. Bits per sample = 16

  9. Data chunk header (sub-chunk 2 ID) = 'data'
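The fields listed above can be assembled into a 44-byte header with Python's `struct` module. This is a sketch under stated assumptions: the function name `build_wav_header` and its parameters are hypothetical, and it writes the canonical layout with no extra chunks between the format and data chunks.

```python
import struct


def build_wav_header(sample_rate_hz, num_samples, channels=1, bits_per_sample=16):
    """Build a canonical 44-byte PCM WAV header (hypothetical helper).

    `num_samples` is the number of sample frames; for mono audio this
    equals the number of samples.
    """
    block_align = channels * bits_per_sample // 8  # bytes per sample frame
    byte_rate = sample_rate_hz * block_align       # bytes per second
    data_size = num_samples * block_align          # length of the audio data

    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",  # "<" = little-endian, no padding
        b"RIFF",          # 1. file marker
        36 + data_size,   #    file length minus the 8-byte RIFF header
        b"WAVE",          # 2. file type header
        b"fmt ",          # 3. format chunk marker (note the trailing space)
        16,               # 4. length of format data
        1,                # 5. audio format: 1 = PCM
        channels,         # 6. number of channels
        sample_rate_hz,   # 7. sample rate in Hz
        byte_rate,
        block_align,
        bits_per_sample,  # 8. bits per sample
        b"data",          # 9. data chunk header
        data_size,
    )
```

Prepending `build_wav_header(44100, len(pcm) // 2)` to raw 16-bit mono PCM bytes `pcm` yields a file in the format described here.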


Example log

fmt_size                = 16
header_size             = 550948
format                  = 1
channels                = 1
sample rate             = 44100
blocksize               = 2
byte per sec            = 88200
bit depth               = 16
header_data_size        = 550912
sample count            = 275456
actual extracted samples extracted = 275456
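
The values in this log are consistent with one another, which a little arithmetic confirms. The sketch below assumes that `header_size` is the RIFF length field (file size minus 8) and that `header_data_size` is the data chunk length. The log does not state this, but the numbers fit that reading.

```python
# Values copied from the example log above.
channels = 1
sample_rate = 44100
bit_depth = 16
header_size = 550948       # assumed: RIFF length field (file size minus 8)
header_data_size = 550912  # assumed: data chunk length in bytes

block_size = channels * bit_depth // 8         # bytes per sample frame
bytes_per_sec = sample_rate * block_size       # audio data rate
sample_count = header_data_size // block_size  # samples in the data chunk
duration_sec = sample_count / sample_rate      # recording length

assert block_size == 2
assert bytes_per_sec == 88200
assert sample_count == 275456
# 36 = the 44-byte header minus the 8 bytes of 'RIFF' plus its length field
assert header_size == header_data_size + 36

print(f"duration = {duration_sec:.2f} s")
```

Under these assumptions the recording is about 6.25 seconds long.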

Reference for a PCM file with a WAV header:

bytes      variable            description
0  - 3     'RIFF'/'RIFX'       little-/big-endian
4  - 7     wRiffLength         length of file minus the 8-byte RIFF header
8  - 11    'WAVE'
12 - 15    'fmt '
16 - 19    wFmtSize            length of format chunk minus 8-byte header
20 - 21    wFormatTag          identifies PCM, ULAW etc.
22 - 23    wChannels
24 - 27    dwSamplesPerSecond  samples per second per channel
28 - 31    dwAvgBytesPerSec    non-trivial for compressed formats
32 - 33    wBlockAlign         basic block size
34 - 35    wBitsPerSample      non-trivial for compressed formats

PCM formats then go straight to the data chunk:
36 - 39    'data'
40 - 43    dwDataLength        length of data chunk minus 8-byte header
44 - (dwDataLength + 43)       the data
(+ a padding byte if dwDataLength is odd)
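
The byte layout above maps directly onto a `struct` format string. The following is a minimal sketch of reading the first 44 bytes of a file. The names `WavHeader` and `parse_wav_header` are hypothetical, and the code assumes the canonical little-endian layout, with the data chunk immediately after a 16-byte format chunk.

```python
import struct
from typing import NamedTuple


class WavHeader(NamedTuple):
    riff_id: bytes           # bytes 0-3
    riff_length: int         # bytes 4-7
    wave_id: bytes           # bytes 8-11
    fmt_id: bytes            # bytes 12-15
    fmt_size: int            # bytes 16-19
    format_tag: int          # bytes 20-21
    channels: int            # bytes 22-23
    samples_per_sec: int     # bytes 24-27
    avg_bytes_per_sec: int   # bytes 28-31
    block_align: int         # bytes 32-33
    bits_per_sample: int     # bytes 34-35
    data_id: bytes           # bytes 36-39
    data_length: int         # bytes 40-43


# "<" selects little-endian with no padding; the total size is 44 bytes.
HEADER_STRUCT = struct.Struct("<4sI4s4sIHHIIHH4sI")


def parse_wav_header(raw: bytes) -> WavHeader:
    """Decode a canonical 44-byte PCM WAV header (hypothetical helper)."""
    if len(raw) < HEADER_STRUCT.size:
        raise ValueError(f"need at least {HEADER_STRUCT.size} bytes, got {len(raw)}")
    header = WavHeader(*HEADER_STRUCT.unpack_from(raw))
    if header.riff_id != b"RIFF" or header.wave_id != b"WAVE":
        raise ValueError("missing 'RIFF'/'WAVE' markers")
    if header.fmt_id != b"fmt " or header.fmt_size != 16:
        raise ValueError("expected a 'fmt ' chunk of size 16")
    if header.data_id != b"data":
        raise ValueError("data chunk does not directly follow the format chunk")
    return header
```

A typical call reads the header straight from disk, for example `parse_wav_header(open(path, "rb").read(44))`, where `path` is a recording to inspect.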
