Main question: What is the encoding used for wav files? Unsigned PCM 8-bit, or is it another type?
Context:
I accidentally formatted my SD card with data I had yet to download. I was able to recover the files using a recovery wizard (Recuva), but they are now corrupted. They have the same file size, but most audio players give an error message about a missing codec. I found a suggestion to change to a RAW format and import into Audacity. Using the defaults, this works and I can barely hear recorded sounds, but only through a significant amount of white noise. This indicates that the data is there. (I had downloaded about half of the data - 9 days of recording - before some technical glitches, so all was not lost, but it would be nice to get the full dataset back.)
The default file specification in Audacity is:
Encoding: Unsigned 8-bit PCM
Byte order: Little-endian
Start offset: 0 bytes
Sample rate: 44100 Hz
Are these the correct file properties for AudioMoth wav files?
Hi Alex, I seem to be having a similar issue to the one discussed above. A student of mine recorded many wav files with AudioMoths, and they seem to be corrupted. Based on file size the data should be there, but it seems the file header is missing. I've attempted recovery using Audacity with the settings you've provided above, 16-bit little-endian, 360 byte header, but that didn't work. An example file can be found here: https://www.dropbox.com/s/fad3wsinlef7won/20221005_182801.WAV?dl=0. Any help would be much appreciated!
If you email me at alex.rogers@cs.ox.ac.uk I can give you the address. It's probably worth doing as other people may run into the same issue and it would be quite useful to have a way to recover AudioMoth recordings from a corrupted SD card.
I just had a thorough look through the uncorrupted sample set, and there are very few that are usable. They seem to be partially muted. I think the best decision here is to scrap the data set. I had 12 units out last year and have looked through about 10 sample sites from my most recent field season and haven't run into this kind of data quality issue yet, so I am a little surprised. I think that there was something about the way I set up the unit or the microphone was faulty or something along those lines.
I would be happy to send you the sd card via post - from Maryland, USA, but I just don't want to waste your time on this one. Thanks so much for the offer to look further into it!
That file seems to have a different problem in that it is all zeros.
I'll have a go at writing a C app to recover the files directly from the SD card.
Alternatively, where are you based? It might be easier to post me the original SD card and then I can extract them directly from there.
That's great news. That file I sent was out of Recuva, and I only changed the file extension. But here's another one that I haven't changed at all. https://www.dropbox.com/s/0u4oj1g9sij0rz9/20190704_103800.WAV?dl=0
If you have an opinion on the easiest way to add the header I would appreciate it. I've found a few plausible methods through internet searching, but this will be my first go at it. Thanks!
This file has all the data in it but no header. If I add the appropriate WAV file header it is readable again and I can hear birdsong in it.
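For anyone wanting to try the same thing, here is a minimal sketch of adding a header to headerless sample data with Python's standard `wave` module. It assumes the file holds nothing but signed 16-bit little-endian PCM samples. The function name `wrap_raw_pcm`, the 48000 Hz sample rate and the single channel are assumptions for illustration, so substitute the settings the recorder was actually configured with. Note that `wave` writes a standard 44-byte header, not the longer header that AudioMoth itself writes.

```python
import io
import wave


def wrap_raw_pcm(raw: bytes, sample_rate: int = 48000, channels: int = 1) -> bytes:
    """Return a complete WAV file (as bytes) holding `raw` as signed 16-bit PCM.

    sample_rate and channels are assumed values, not read from the data.
    """
    sample_width = 2  # bytes per sample (16-bit)
    frame_size = sample_width * channels
    # Drop any trailing partial frame so the data is a whole number of frames.
    usable = len(raw) - (len(raw) % frame_size)

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(raw[:usable])
    return buffer.getvalue()
```

To use it, read the recovered file in binary mode, pass the bytes to `wrap_raw_pcm`, and write the result to a new file with a `.wav` extension. If the audio plays too fast or too slow, the assumed sample rate is probably wrong.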
https://www.dropbox.com/s/hsbo8fvcptycw7k/20190801_171400.WAV?dl=0
It would probably be easiest to work directly from whatever comes out of the Recuva app. Can you post one of those files?
Thanks. I could use some help. I was hoping that just assigning the correct encoding type would help in Audacity, but I couldn't hear anything when set to signed 16-bit PCM (as opposed to when I used unsigned 8-bit PCM). Any suggestions for an application or R package (tuneR function?) that could help me do as you suggest - dump file contents as binary and read sequentially - would be much appreciated. Here is an example file whose file type I've changed to RAW. This is a 60 sec recording with all the AudioMoth default recording options. https://www.dropbox.com/s/wshc3p4q0i9r0kc/20190801_171400.RAW?dl=0
They are signed 16-bit little-endian PCM samples. The WAV file header is always 488 bytes long and the data starts immediately afterwards. The first four bytes of each file header are 'RIFF', so in the worst case, dumping the file contents as binary data and reading through sequentially will recover all the data. Let us know if you need some help with this.
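As a rough illustration of the sequential scan described above, the Python sketch below searches a binary dump for 'RIFF' markers and decodes the samples that follow each 488-byte header. The function names `find_wav_starts` and `extract_samples` are made up for this example. It assumes each recording is stored contiguously in the dump, which may not hold on a fragmented card. It also checks for the 'WAVE' tag at byte 8 of the header (part of the standard RIFF/WAVE layout) to reduce false matches on sample data that happens to contain 'RIFF'.

```python
import struct

HEADER_SIZE = 488  # AudioMoth WAV header length, as stated in this thread


def find_wav_starts(dump: bytes) -> list:
    """Return the offsets of every 'RIFF' marker that is followed by 'WAVE'."""
    offsets = []
    pos = dump.find(b"RIFF")
    while pos != -1:
        # In a RIFF/WAVE file the form type 'WAVE' sits at bytes 8..11.
        if dump[pos + 8:pos + 12] == b"WAVE":
            offsets.append(pos)
        pos = dump.find(b"RIFF", pos + 1)
    return offsets


def extract_samples(dump: bytes, start: int, end: int) -> list:
    """Decode the signed 16-bit little-endian samples of one recording.

    `start` is the offset of its 'RIFF' marker and `end` is where the
    recording stops (for example, the start of the next recording).
    """
    payload = dump[start + HEADER_SIZE:end]
    count = len(payload) // 2  # ignore a trailing odd byte, if any
    return list(struct.unpack(f"<{count}h", payload[:count * 2]))
```

Consecutive offsets from `find_wav_starts` give the `start` and `end` of each recording, and the end of the dump serves as `end` for the last one. Every value returned by `extract_samples` falls between -32768 and 32767, the range of a signed 16-bit sample.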