Friday, July 1, 2011

Control the Mouse

Due to readers' requests, today I am publishing a post about how to control the mouse using C#: moving the cursor across the whole screen and simulating clicks, for example.
I already wrote a post about simulating key presses, which can easily be implemented with the function SendKeys().
Controlling the mouse is not that easy: there is (yet) no dedicated .NET function, so we have to use P/Invoke. The core is the function SendInput(). With it, different commands can be sent to the executing computer, for example hardware, keyboard and mouse events (called inputs in the following).
The function expects 3 arguments: the first gives the number of inputs, the second is a reference to the actual input data, and the third holds the size of each input. The second parameter has to be a structure which contains at least the type of the input (hardware, keyboard, mouse, ...) as well as the input data itself.
I called this structure Input. A mouse input carries multiple values, such as the X and Y coordinate, which I summarized in the structure MouseInput.

First I want to describe moving the mouse: for this we need an instance of the structure Input with the desired target coordinates.
As the type of the input we set 0, which denotes a mouse event. The mouse data, represented by an instance of MouseInput, we create via the function CreateMouseInput(). It receives all needed parameters and sets them in the resulting mouse input. For moving the mouse only the values for X and Y (the target coordinates) as well as DwFlags matter. This field holds certain flags, for example for the pressing of a mouse button. In this case we combine 2 previously defined constants, MOUSEEVENTF_ABSOLUTE and MOUSEEVENTF_MOVE, to indicate that we want to move the mouse. The first says that the given coordinates are absolute screen coordinates (instead of a movement relative to the current position), the second indicates the desired movement of the mouse.

The simulation of a click works similarly. We again first have to create inputs with the desired data, this time though as an array of length 2, since we want to simulate the pressing and the subsequent release of the mouse button.
With the function CreateMouseInput() we therefore create 2 mouse inputs and set the right constants for the DwFlags.
I hope this explanation as well as the following code make the topic clear:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;

using System.Runtime.InteropServices;


namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        // P/Invoke function for controlling the mouse
        [DllImport("user32.dll", SetLastError = true)]
        private static extern uint SendInput(uint nInputs, Input[] pInputs, int cbSize);

        /// <summary>
        /// structure for mouse data
        /// </summary>
        [StructLayout(LayoutKind.Sequential)]
        struct MouseInput
        {
            public int X; // X coordinate
            public int Y; // Y coordinate
            public uint MouseData; // mouse data, e.g. for the mouse wheel
            public uint DwFlags; // further mouse data, e.g. for mouse buttons
            public uint Time; // time of the event
            public IntPtr DwExtraInfo; // further information
        }

        /// <summary>
        /// super structure for input data of the function SendInput
        /// </summary>
        [StructLayout(LayoutKind.Sequential)]
        struct Input
        {
            public int Type; // type of the input, 0 for mouse
            public MouseInput Data; // mouse data
        }

        // constants for mouse flags
        const uint MOUSEEVENTF_LEFTDOWN = 0x0002; // press left mouse button
        const uint MOUSEEVENTF_LEFTUP = 0x0004; // release left mouse button
        const uint MOUSEEVENTF_ABSOLUTE = 0x8000; // coordinates are absolute screen coordinates, not relative
        const uint MOUSEEVENTF_MOVE = 0x0001; // move mouse

        private MouseInput CreateMouseInput(int x, int y, uint data, uint time, uint flag)
        {
            // create from the given data an object of the type MouseInput, which then can be send
            MouseInput Result = new MouseInput();
            Result.X = x;
            Result.Y = y;
            Result.MouseData = data;
            Result.Time = time;
            Result.DwFlags = flag;
            return Result;
        }

        private void SimulateMouseClick()
        {
            // simulate a left click: press and release the mouse button
            Input[] MouseEvent = new Input[2];
            MouseEvent[0].Type = 0; // INPUT_MOUSE
            MouseEvent[0].Data = CreateMouseInput(0, 0, 0, 0, MOUSEEVENTF_LEFTDOWN);

            MouseEvent[1].Type = 0; // INPUT_MOUSE
            MouseEvent[1].Data = CreateMouseInput(0, 0, 0, 0, MOUSEEVENTF_LEFTUP);

            SendInput((uint)MouseEvent.Length, MouseEvent, Marshal.SizeOf(typeof(Input)));
        }

        private void SimulateMouseMove(int x, int y)
        {
            Input[] MouseEvent = new Input[1];
            MouseEvent[0].Type = 0; // INPUT_MOUSE
            // move the mouse: flags ABSOLUTE (whole screen) and MOVE (movement);
            // note that with MOUSEEVENTF_ABSOLUTE the coordinates are expected on a
            // normalized 0..65535 scale, not in pixels
            MouseEvent[0].Data = CreateMouseInput(x, y, 0, 0, MOUSEEVENTF_ABSOLUTE | MOUSEEVENTF_MOVE);
            SendInput((uint)MouseEvent.Length, MouseEvent, Marshal.SizeOf(typeof(Input)));
        }


        private void button1_Click(object sender, EventArgs e)
        {
            // move mouse
            SimulateMouseMove(0, 0);
            // click
            SimulateMouseClick();
        }
    }
}

Saturday, May 7, 2011

Restart Application With Administrative Rights

In an older post I showed how to provide a C# application with administrative rights. For that, however, a file, the application manifest file, was needed.
Responding to a reader's request I want to present a little trick for equipping an application with these rights through code alone.
Necessary for this is the class System.Diagnostics.Process, which can start a process with arbitrary arguments. So if the application was started with normal rights, we let it restart itself with administrative rights.
To avoid an infinite loop, we add a new command line argument to the call.
The following code should demonstrate this idea:

private void Form1_Load(object sender, EventArgs e)
{
    string[] CommandLineArgs = Environment.GetCommandLineArgs(); // read out the command line arguments
    if (CommandLineArgs.Length <= 1 || CommandLineArgs[1] != "restarted") // if the 2nd argument (the 1st is always the file path) != "restarted", the program has not been restarted yet
    {
        ProcessStartInfo ProcessInfo = new ProcessStartInfo(Application.ExecutablePath, "restarted"); // set process information: application path and the argument "restarted"
        ProcessInfo.Verb = "runas"; // verb for execution as administrator
        Process.Start(ProcessInfo); // start the process with the desired settings
        Environment.Exit(0); // exit the non-elevated instance; the argument "restarted" prevents an infinite loop
    }
    // otherwise the program was already restarted with administrative rights and simply continues
}
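For completeness: the manifest-based way mentioned at the beginning of this post needs no code at all - the application manifest file contains a requestedExecutionLevel element (standard manifest content, sketched here):

```xml
<!-- app.manifest: request elevation at startup instead of restarting -->
<trustInfo xmlns="urn:schemas-microsoft-com:asm.v2">
  <security>
    <requestedPrivileges>
      <requestedExecutionLevel level="requireAdministrator" uiAccess="false" />
    </requestedPrivileges>
  </security>
</trustInfo>
```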

Friday, May 6, 2011

Get Command Line Arguments

Generally, programs in Windows can be called with various arguments, the so-called command line arguments.
For example, some programs can be called directly with the file which is to be opened.
In console applications these arguments are directly accessible as parameters of the function Main().
In Windows Forms applications they are not, but we can get them with the function Environment.GetCommandLineArgs(), which returns the command line arguments as an array of the type string.
The following code reads out the passed arguments at program start:

string[] CommandLineArgs = Environment.GetCommandLineArgs();

As you can see, the first element always contains the path and name of the executing program.
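For comparison, a minimal console sketch (the class name is made up) showing both ways to reach the arguments:

```csharp
using System;

class ArgsDemo // hypothetical demo program
{
    static void Main(string[] args)
    {
        // in a console application the arguments arrive directly in Main();
        // unlike GetCommandLineArgs(), args does NOT contain the executable path
        Console.WriteLine("Arguments passed: " + args.Length);

        // Environment.GetCommandLineArgs() works here as well;
        // its first element is always the path of the executing program
        string[] all = Environment.GetCommandLineArgs();
        Console.WriteLine("Executable: " + all[0]);
    }
}
```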

Friday, April 1, 2011

Create a JumpList Using C#

The JumpList in Windows 7 is the context menu of an application in the taskbar, which pops up after right-clicking the application icon.
In this menu frequently opened files are visible as well as options for the program.
For the C# studio, my JumpList looks as follows, for example:



In C# we can now create and edit JumpLists for our own applications. Although .NET already provides a class for JumpLists, for Windows Forms applications we still need the Windows API Code Pack, since the provided class is only accessible to WPF applications.
The installation of the Windows API Code Pack I described here; the 2 created dll files have to be referenced afterwards (Project - Add Reference - Browse). The needed files are "Microsoft.WindowsAPICodePack.dll" and "Microsoft.WindowsAPICodePack.Shell.dll"; they are located in the installation folder of the code pack under Shell\bin\Debug.
The class JumpList used for this post is contained in Microsoft.WindowsAPICodePack.Taskbar.
In a JumpList, 2 categories are present by default: "Tasks" and "Recent" (recently opened files).
These categories can be filled with the functions AddUserTasks() and AddToRecent().
As entries I use in this post objects of the class JumpListLink, which contain a name and a target, which can be an executable program or a file.
The category "Recent", though, cannot be filled that easily, since Windows only accepts files with extensions associated to the program. With a little trick this difficulty can be avoided: you can add your own categories to JumpLists - if you now create a category named "Recently Used", for example, the result is the same.
The sample code creates 2 new entries under "Tasks" in the JumpList - one entry for Paint and one for the Windows calculator. For these programs icons are also provided - set as IconReference directly from the original program.
Furthermore the code creates a new category and a link to a file on my desktop.
The concluding Refresh() is important to apply the changes.
The JumpList is also visible in the same form when starting the program the next time - if it was not deleted or changed.
The result looks as follows:



Now the code, at the beginning a
using Microsoft.WindowsAPICodePack.Taskbar; is needed:

            JumpList CustomJumpList = JumpList.CreateJumpList(); // create new JumpList

            // Add Link to Paint to Tasks
            JumpListLink EntryPaint = new JumpListLink(@"C:\Windows\System32\mspaint.exe", "Paint");
            EntryPaint.IconReference = new Microsoft.WindowsAPICodePack.Shell.IconReference(@"C:\Windows\System32\mspaint.exe", 0);
            CustomJumpList.AddUserTasks(EntryPaint);

            // Add link to Calculator
            JumpListLink EntryCalc = new JumpListLink(@"C:\Windows\System32\calc.exe", "Calculator");
            EntryCalc.IconReference = new Microsoft.WindowsAPICodePack.Shell.IconReference(@"C:\Windows\System32\calc.exe", 0);
            CustomJumpList.AddUserTasks(EntryCalc);

            // create custom category
            JumpListCustomCategory CustomCategory = new JumpListCustomCategory("New Category");
            // create link to file in this
            CustomCategory.AddJumpListItems(new JumpListLink(@"C:\Users\User\Desktop\C#.txt", "C# Links"));
            CustomJumpList.AddCustomCategories(CustomCategory);

            // update changes
            CustomJumpList.Refresh();

Wednesday, March 23, 2011

Speech Recognition Part 2 - Command Mode

As announced, this post is a sequel to the previous one, so the topic is again speech recognition, this time though the Command Mode.
As the name already suggests, the recorded sound is analyzed for specific commands. Of course you could use the previously introduced Dictation Mode and look for commands in the recorded texts yourself, but the Command Mode offers some advantages.
On the one hand it enables continuous checking whether commands were spoken, and on the other hand the recognition is better, since the program knows only the commands and thus just has to distinguish between them.
Now directly to the code: the requirements (including the references etc.) are the same as in the previous post.
For recognizing the commands we now, however, use an instance of the class SpeechRecognizer. This also expects a grammar, which we now have to fill manually with the words that should be recognized.
By setting the property Enabled to true, the speech recognition is activated.
When a command is recognized, the event SpeechRecognized is triggered; we assign a function to it so we can use the event.
When a SpeechRecognizer is initialized, a window for speech recognition opens - here the user has to push the big microphone button to start the recognition.
Pressing the button has nothing to do with the property Enabled - both have to be activated. Unfortunately I did not find a way to start the speech recognition by code alone and keep the window hidden. But I would be interested whether somebody knows one - so if someone has an idea, please let me know!
The code:

        private void StartListening()
        {
            SpeechRecognizer SR = new SpeechRecognizer();
            // register the commands
            Choices Commands = new Choices();
            Commands.Add("Stop");
            Commands.Add("Go");
            GrammarBuilder GB = new GrammarBuilder(Commands); // load the commands into a GrammarBuilder
            Grammar CommandGrammar = new Grammar(GB); // create a grammar with the GrammarBuilder
            SR.LoadGrammar(CommandGrammar); // load the grammar
            SR.SpeechRecognized += CommandRecognized; // function to handle the event
            SR.Enabled = true;
        }

        private void CommandRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            string Command = e.Result.Text;
            // here more code to process ...
        }

Tuesday, March 22, 2011

Speech Recognition Part 1 - Dictation Mode

The previous post was about speech output with .Net, this post is dedicated to the inverse, namely speech recognition.
This is much harder for computers than the output, but the .NET framework again provides ready-made functionality in the namespace System.Speech.Recognition, with which speech recognition can be realized with little effort.
In general there are 2 modes in which speech recognition can be run: This post is about the Dictation Mode, the next about the Command Mode.
The Dictation Mode is, as the name already suggests, suited for dictating texts. The recorded sound is understood as a dictation and the program tries to understand the spoken words.
As for the speech output, a reference to System.Speech first has to be added. This time, though, the needed namespace is Recognition, so we first use the following using directive:

using System.Speech.Recognition;

For speech recognition we use an instance of the class SpeechRecognitionEngine. This needs a grammar, which is a kind of command list describing how to interpret the language.
As grammar we hand over an instance of the class DictationGrammar, to indicate that we want to use the dictation mode.
The recognition of spoken words then works with the function Recognize(). This prepares the recognition and starts it when the microphone records sound. If the speaker takes a break (the required duration can be configured), the speech recognition finishes and the program tries to interpret the sound as words (asynchronous recognition is also possible). Finally the dictation result is returned.
Now the code:

            SpeechRecognitionEngine SRE = new SpeechRecognitionEngine();
            SRE.LoadGrammar(new DictationGrammar()); // load dictation grammar
            SRE.SetInputToDefaultAudioDevice(); // set the recording source to the default audio device

            RecognitionResult Result = SRE.Recognize(); // record sound and recognize
            string ResultString = "";
            // add all recognized words to the result string, separated by spaces
            foreach (RecognizedWordUnit w in Result.Words)
            {
                ResultString += w.Text + " ";
            }

Monday, March 21, 2011

Speech Output with C#

The title probably sounds difficult, but in C# speech output can be realized with only 2 lines of code.
This is because the speech synthesis already known from Windows can also be used from the .NET framework; the needed component is the namespace System.Speech.Synthesis. To be able to use it, a reference first has to be added: Project - Add Reference - (tab .NET) System.Speech. To simplify the code, we also include it via using:

using System.Speech.Synthesis;

For the speech output we need an instance of the class SpeechSynthesizer, which outputs spoken text when the function Speak() is called.
But of course there are also further properties with which one can experiment.
A little overview:
 - Rate: Speed of the text output, ranges from -10 to 10
 - Volume: Volume, ranges from 0 - 100
 - Voice: Voice
The last property is read-only, but it can be changed with some functions.
Useful is the function SelectVoiceByHints(), which expects "hints" according to which a voice is selected. Such criteria are for example gender, age and even culture group.
The following command looks for a voice which sounds female and adult and sets it as the new output voice:

Speaker.SelectVoiceByHints(VoiceGender.Female, VoiceAge.Adult);

However, on most Windows computers there will only be one preinstalled voice (on my computer, for example, only "Microsoft Anna"). Then the voice search does not have any influence.
A list of installed voices can be obtained via Speaker.GetInstalledVoices(). The single voices are represented as objects of the type InstalledVoice; the following example reads out all installed voices and adds their names to the list InstalledVoices:

List<string> InstalledVoices = new List<string>();
foreach (InstalledVoice voice in Speaker.GetInstalledVoices())
{
    InstalledVoices.Add(voice.VoiceInfo.Name);
}
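Putting the pieces of this post together, a minimal sketch of the speech output could look like this (the property values and the spoken text are only examples; System.Speech requires Windows):

```csharp
using System.Speech.Synthesis; // reference to System.Speech required

class SpeechDemo // hypothetical demo program
{
    static void Main()
    {
        SpeechSynthesizer Speaker = new SpeechSynthesizer();
        Speaker.Rate = 0;     // normal speed (range -10 to 10)
        Speaker.Volume = 100; // maximum volume (range 0 to 100)
        Speaker.Speak("Hello world"); // blocks until the text has been spoken
    }
}
```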

Friday, March 18, 2011

Determine Memory Usage of a Process

Today's post is, after the long preceding posts, a bit shorter again.
I want to show you how to determine the memory usage of a specific process.
The property needed for that is called WorkingSet64, from the class System.Diagnostics.Process. It returns the number of bytes occupied by the process in RAM.
To read out this number for some process, the process first has to be "grabbed", for example with the function Process.GetProcessesByName() (some background information about that is provided in this post).
The function looks for processes of the given name and saves them in the resulting array.
The following code determines the memory usage of the given process by reading the above-mentioned property of the first process in the list - so if there are multiple processes of the same name, the code probably has to be altered.
The example determines the memory usage of the first opened Notepad window:

Process[] Application;
Application = Process.GetProcessesByName("notepad");
long MemorySize = Application[0].WorkingSet64;
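Since WorkingSet64 returns bytes, the value can be converted to megabytes for display; the following sketch also sums over all processes of the given name (the class name and the rounding format are just examples):

```csharp
using System;
using System.Diagnostics;

class MemoryDemo // hypothetical demo class
{
    // convert a byte count to megabytes (1 MB = 1024 * 1024 bytes)
    public static double BytesToMegabytes(long bytes)
    {
        return bytes / (1024.0 * 1024.0);
    }

    static void Main()
    {
        long total = 0;
        // sum the working sets of ALL processes with the given name,
        // instead of taking only the first entry of the array
        foreach (Process p in Process.GetProcessesByName("notepad"))
        {
            total += p.WorkingSet64;
        }
        Console.WriteLine("Memory usage: " + BytesToMegabytes(total).ToString("0.00") + " MB");
    }
}
```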

Monday, March 14, 2011

Mix Wave Files Together (Crossfade)

In the previous two posts the Wave format was analyzed; I showed how to read and write Wave files with C#.
In this post I want to demonstrate a "cool" use of that, namely creating a program which crossfades 2 songs.
The mixing happens by fading the songs: the volume of the first song is decreased towards the end, while the volume of the second song is increased gradually.
Many will probably hope for an application doing the same for MP3 files - unfortunately the MP3 format is much more complicated than the Wave format. But arbitrary MP3 files can be converted with the program Audacity, for example.
The code used here still uses the class WaveFile, which was started in the previous two posts.
A sample run of mixing 2 Wave files could look as follows:

            WaveFile WF1 = new WaveFile();
            WF1.LoadWave(@"C:\Users\User\Desktop\101-die_atzen_-_disco_pogo-ysp.wav");

            WaveFile WF2 = new WaveFile();
            WF2.LoadWave(@"C:\Users\User\Desktop\StroboPopDieAtzenFeatNena.wav");

            WaveFile.StoreMixWave(@"C:\Users\User\Desktop\mixed.wav", WF1, WF2, 10);

So first the 2 Wave files are read in, and finally the static function StoreMixWave() is called, which expects as arguments the path to the target file, the Wave files to be merged and the fading time in seconds.
The function code is:

public static void StoreMixWave(string path, WaveFile wf1, WaveFile wf2, int fadeTime)
{
    WaveFile Mixed = MixWave(wf1, wf2, fadeTime); // mix result file
    Mixed.StoreWave(path); // save result file
}

So the resulting Wave file is created in the function MixWave() and is then written with the known function StoreWave().
Now let us directly view the code of MixWave():

        private static WaveFile MixWave(WaveFile wf1, WaveFile wf2, int fadeTime)
        {
            int FadeSamples = fadeTime * wf1.ByteRate / wf1.NumChannels; // number of samples affected by the crossfading
            int FadeBytes = fadeTime * wf1.ByteRate; // number of affected bytes

            WaveFile Result = new WaveFile(); // resulting Wave file
            Result.FileSize = wf1.FileSize + wf2.DataSize - 2 * FadeBytes; // file size
            Result.Format = "WAVE";

            // copy information from the fmt  chunk
            Result.FmtChunkSize = wf1.FmtChunkSize;
            Result.AudioFormat = wf1.AudioFormat;
            Result.NumChannels = wf1.NumChannels;
            Result.SampleRate = wf1.SampleRate;
            Result.ByteRate = wf1.ByteRate;
            Result.BlockAlign = wf1.BlockAlign;
            Result.BitsPerSample = wf1.BitsPerSample;

            Result.DataSize = wf1.DataSize + wf2.DataSize - 2 * FadeBytes; // new size of the data chunk
            Result.Data = new int[wf1.NumChannels][]; // copy number of channels
            int NumSamples = Result.DataSize / (Result.NumChannels * ( Result.BitsPerSample / 8)); //  number of samples in resulting file

            // initialize data arrays for all samples and channels
            for (int i = 0; i < Result.Data.Length; i++)
            {
                Result.Data[i] = new int[NumSamples];
            }

            int PosCounter = 0; // position of the current sample in the Wave file

            // copy the samples from the first Wave file into the data field of the result file
            for (int i = 0; i < wf1.Data[0].Length; i++)
            {
                // copy the current sample for all channels
                for (int j = 0; j < wf1.NumChannels; j++)
                {
                    // if the current sample is in the crossfading time, mix the amplitude value of the 1st file with the amplitude value of the 2nd file
                    if (i > wf1.Data[0].Length - FadeSamples)
                       Result.Data[j][PosCounter] = (int)(wf1.Data[j][i] * Factor(i - (wf1.Data[0].Length - FadeSamples), FadeSamples, 0) + wf2.Data[j][i - (wf1.Data[0].Length - FadeSamples)] * Factor(i - (wf1.Data[0].Length - FadeSamples), FadeSamples, 1));
                    else
                       Result.Data[j][PosCounter] = wf1.Data[j][i];
                }
                PosCounter++;
            }

            // copy the remaining samples
            for (int i = FadeSamples; i < wf2.Data[0].Length; i++)
            {
                for (int j = 0; j < wf1.NumChannels; j++)
                {
                    Result.Data[j][PosCounter] = wf2.Data[j][i];
                }
                PosCounter++;
            }
            return Result;
        }

At the beginning the number of samples affected by the crossfading is calculated. Remember: a Wave file can be viewed as a collection of amplitudes. At a certain sampling rate these amplitudes are measured at specific time intervals and saved in the file. These single values are the samples.
To lay 2 audio files over each other, one can simply add the amplitudes.
The number of affected samples is calculated as the product of crossfading time in seconds and byte rate (number of bytes per second), divided by the number of channels (since the byte rate counts the bytes of all channels).
Then the files are mixed together:
The first two blocks ("RIFF" and "fmt ") are just copied from the first Wave file - the file size, of course, is changed accordingly. The new file size is the sum of the old ones, minus twice the number of bytes that are crossfaded.
In order to get a good result, we require here that both Wave files have a similar fmt chunk, meaning the same sample rate, number of channels etc. This is the case anyway if the Wave files were created with Audacity.
The crossfading itself is a little more complicated. First the size of the resulting data block is calculated and from that the number of samples, by the formula (size of data chunk) / (number of channels * BitsPerSample / 8).
The variable PosCounter counts the position in the Data array of the result file.
With 2 loops all samples and channels of the first Wave file are iterated. The corresponding values from its Data array are written to position PosCounter in the Data array of the result file.
If the loop has reached a position which lies in the crossfade zone, the samples are no longer simply copied but mixed with those of the second file.
In order for the first song to become quieter and the second one to become louder, the function Factor() calculates the corresponding share of the resulting amplitude.
Finally, the not yet processed samples of the 2nd file are copied to the resulting file and the mix is complete.
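The code of Factor() itself is not shown in the post. A minimal linear version could look like the following sketch (an assumption on my part - the original may well use a different fade curve; the wrapper class only exists for the demo). position runs from 0 to fadeSamples, mode 0 fades out, mode 1 fades in:

```csharp
class FadeDemo // hypothetical demo class; in the post Factor() belongs to WaveFile
{
    // hypothetical linear crossfade factor (not from the original post):
    // mode 0: fade out, the factor falls from 1.0 to 0.0
    // mode 1: fade in, the factor rises from 0.0 to 1.0
    public static double Factor(int position, int fadeSamples, int mode)
    {
        double progress = (double)position / fadeSamples;
        return mode == 0 ? 1.0 - progress : progress;
    }
}
```

With a linear fade the two factors always sum to 1, so the mixed amplitude stays in range as long as both inputs do.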
Finally, an overview of the code of the whole class WaveFile:


    public class WaveFile  
    {
        int FileSize; // 1
        string Format; // 2
        int FmtChunkSize; // 3
        int AudioFormat; // 4
        int NumChannels; // 5
        int SampleRate; // 6
        int ByteRate; // 7
        int BlockAlign; // 8
        int BitsPerSample; // 9
        int DataSize; // 10

        int[][] Data; // 11

        #region Reading
        public void LoadWave(string path)
        {
            System.IO.FileStream fs = System.IO.File.OpenRead(path); // open the Wave file to be read
            LoadChunk(fs); // read RIFF chunk
            LoadChunk(fs); // read fmt chunk
            LoadChunk(fs); // read data chunk
        }

        private void LoadChunk(System.IO.FileStream fs)
        {
            System.Text.ASCIIEncoding Encoder = new ASCIIEncoding();

            byte[] bChunkID = new byte[4];
            /* Read the first 4 bytes.
            In every chunk these form the chunk's name. */
            fs.Read(bChunkID, 0, 4);
            string sChunkID = Encoder.GetString(bChunkID); // decode the name from the bytes

            byte[] ChunkSize = new byte[4];
            /* In every chunk the next 4 bytes hold the size information. */
            fs.Read(ChunkSize, 0, 4);

            if (sChunkID.Equals("RIFF"))
            {
                // for the RIFF chunk ...
                // store the size in FileSize
                FileSize = System.BitConverter.ToInt32(ChunkSize, 0);
                // read the format
                byte[] Format = new byte[4];
                fs.Read(Format, 0, 4);
                // yields "WAVE" as a string
                this.Format = Encoder.GetString(Format);
            }

            if (sChunkID.Equals("fmt "))
            {
                // for the fmt chunk store the size in FmtChunkSize
                FmtChunkSize = System.BitConverter.ToInt32(ChunkSize, 0);
                // and read and store the other information as well
                byte[] AudioFormat = new byte[2];
                fs.Read(AudioFormat, 0, 2);
                this.AudioFormat = System.BitConverter.ToInt16(AudioFormat, 0);
                byte[] NumChannels = new byte[2];
                fs.Read(NumChannels, 0, 2);
                this.NumChannels = System.BitConverter.ToInt16(NumChannels, 0);
                byte[] SampleRate = new byte[4];
                fs.Read(SampleRate, 0, 4);
                this.SampleRate = System.BitConverter.ToInt32(SampleRate, 0);
                byte[] ByteRate = new byte[4];
                fs.Read(ByteRate, 0, 4);
                this.ByteRate = System.BitConverter.ToInt32(ByteRate, 0);
                byte[] BlockAlign = new byte[2];
                fs.Read(BlockAlign, 0, 2);
                this.BlockAlign = System.BitConverter.ToInt16(BlockAlign, 0);
                byte[] BitsPerSample = new byte[2];
                fs.Read(BitsPerSample, 0, 2);
                this.BitsPerSample = System.BitConverter.ToInt16(BitsPerSample, 0);
            }

            if (sChunkID == "data")
            {
                // for the data chunk store the size information in DataSize
                DataSize = System.BitConverter.ToInt32(ChunkSize, 0);

                // the 1st index of Data specifies the audio channel, the 2nd the sample
                Data = new int[this.NumChannels][];
                // temporary array for reading the bytes of one channel per sample
                byte[] temp = new byte[BlockAlign / NumChannels];
                // for each channel, dimension the Data array to the number of samples
                for (int i = 0; i < this.NumChannels; i++)
                {
                    Data[i] = new int[this.DataSize / (NumChannels * BitsPerSample / 8)];
                }

                // go through all samples one after another
                for (int i = 0; i < Data[0].Length; i++)
                {
                    // go through all audio channels per sample
                    for (int j = 0; j < NumChannels; j++)
                    {
                        // read the number of bytes used per sample and channel
                        if (fs.Read(temp, 0, BlockAlign / NumChannels) > 0)
                        {   // depending on how many bytes are used per value,
                            // interpret the amplitude as Int16 or Int32
                            if (BlockAlign / NumChannels == 2)
                                Data[j][i] = System.BitConverter.ToInt16(temp, 0);
                            else
                                Data[j][i] = System.BitConverter.ToInt32(temp, 0);
                        }
                        /* else:
                         * values other than 2 or 4 are not handled; extend here if needed!
                        */
                    }
                }
            }
        }
        #endregion

        #region Writing
        public void StoreWave(string path)
        {
            System.IO.FileStream fs = System.IO.File.OpenWrite(path); // open / create the Wave file to be written
            StoreChunk(fs, "RIFF"); // write RIFF chunk
            StoreChunk(fs, "fmt "); // write fmt chunk
            StoreChunk(fs, "data"); // write data chunk
        }

        private void StoreChunk(System.IO.FileStream fs, string chunkID)
        {
            System.Text.ASCIIEncoding Decoder = new ASCIIEncoding();
            // convert the name to bytes and write it
            fs.Write(Decoder.GetBytes(chunkID), 0, 4);

            if (chunkID == "RIFF")
            {
                // in the RIFF chunk write FileSize as the size information and the format
                fs.Write(System.BitConverter.GetBytes(FileSize), 0, 4);
                fs.Write(Decoder.GetBytes(Format), 0, 4);
            }
            if (chunkID == "fmt ")
            {
                // for the fmt chunk write its size as well as the other encoded information
                fs.Write(System.BitConverter.GetBytes(FmtChunkSize), 0, 4);
                fs.Write(System.BitConverter.GetBytes(AudioFormat), 0, 2);
                fs.Write(System.BitConverter.GetBytes(NumChannels), 0, 2);
                fs.Write(System.BitConverter.GetBytes(SampleRate), 0, 4);
                fs.Write(System.BitConverter.GetBytes(ByteRate), 0, 4);
                fs.Write(System.BitConverter.GetBytes(BlockAlign), 0, 2);
                fs.Write(System.BitConverter.GetBytes(BitsPerSample), 0, 2);
            }
            if (chunkID == "data")
            {
                // in the data chunk, write the size of the data block
                fs.Write(System.BitConverter.GetBytes(DataSize), 0, 4);
                // then write the single amplitudes, as described: sample by sample,
                // each with all audio channels
                for (int i = 0; i < Data[0].Length; i++)
                {
                    for (int j = 0; j < NumChannels; j++)
                    {
                        fs.Write(System.BitConverter.GetBytes(Data[j][i]), 0, BlockAlign / NumChannels);
                    }
                }
            }
        }
        #endregion

        #region Mixing
        private static WaveFile MixWave(WaveFile wf1, WaveFile wf2, int fadeTime)
        {
            int FadeSamples = fadeTime * wf1.ByteRate / wf1.NumChannels; // number of samples to fade out / in
            int FadeBytes = fadeTime * wf1.ByteRate; // number of bytes to fade out / in

            WaveFile Result = new WaveFile(); // resulting Wave file
            Result.FileSize = wf1.FileSize + wf2.DataSize - 2 * FadeBytes; // new file size
            Result.Format = "WAVE";

            // copy the information from the fmt chunk
            Result.FmtChunkSize = wf1.FmtChunkSize;
            Result.AudioFormat = wf1.AudioFormat;
            Result.NumChannels = wf1.NumChannels;
            Result.SampleRate = wf1.SampleRate;
            Result.ByteRate = wf1.ByteRate;
            Result.BlockAlign = wf1.BlockAlign;
            Result.BitsPerSample = wf1.BitsPerSample;

            Result.DataSize = wf1.DataSize + wf2.DataSize - 2 * FadeBytes; // new size of the data chunk
            Result.Data = new int[wf1.NumChannels][]; // keep the number of channels
            int NumSamples = Result.DataSize / (Result.NumChannels * (Result.BitsPerSample / 8)); // compute the number of samples in the resulting file

            // size the Data arrays of all channels to the number of samples
            for (int i = 0; i < Result.Data.Length; i++)
            {
                Result.Data[i] = new int[NumSamples];
            }

            int PosCounter = 0; // position of the current sample in the resulting file

            // copy the samples of the first Wave file into the Data array of the resulting file
            for (int i = 0; i < wf1.Data[0].Length; i++)
            {
                // copy the current sample in all channels
                for (int j = 0; j < wf1.NumChannels; j++)
                {
                    // if the current sample falls into the crossfade interval, mix the amplitude of the first file with the amplitude of the second file
                    if (i > wf1.Data[0].Length - FadeSamples)
                       Result.Data[j][PosCounter] = (int)(wf1.Data[j][i] * Factor(i - (wf1.Data[0].Length - FadeSamples), FadeSamples, 0) + wf2.Data[j][i - (wf1.Data[0].Length - FadeSamples)] * Factor(i - (wf1.Data[0].Length - FadeSamples), FadeSamples, 1));
                    else
                       Result.Data[j][PosCounter] = wf1.Data[j][i];
                }
                PosCounter++;
            }

            // copy the remaining samples into the resulting file
            for (int i = FadeSamples; i < wf2.Data[0].Length; i++)
            {
                for (int j = 0; j < wf1.NumChannels; j++)
                {
                    Result.Data[j][PosCounter] = wf2.Data[j][i];
                }
                PosCounter++;
            }
            return Result;
        }
       
        /// <summary>
        /// This function computes the weighting of the amplitudes during crossfading.
        /// </summary>
        /// <param name="pos">Position in the file relative to the start of the crossfade interval</param>
        /// <param name="max">End of the crossfade, relative to pos</param>
        /// <param name="song">Can take the values 0 (song to fade out) or 1 (song to fade in)</param>
        /// <returns>The weighting factor between 0 and 1</returns>
        private static double Factor(int pos, int max, int song)
        {
            if (song == 0)
                return 1 - Math.Pow((double)pos / (double)max, 2);
            else
                return Math.Pow((double)pos / (double)max, 2);
        }

        public static void StoreMixWave(string path, WaveFile wf1, WaveFile wf2, int fadeTime)
        {
            WaveFile Mixed = MixWave(wf1, wf2, fadeTime); // mix the resulting file
            Mixed.StoreWave(path); // save the resulting file to disk
        }

        #endregion
    }
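A note on the weighting: because Factor() returns 1 - x² for the outgoing song and x² for the incoming one, the two weights sum to exactly 1 at every position, so the overall level stays constant during the crossfade. A small standalone sketch of this behavior (the loop and the demo value of max are mine, not part of the class above):

```csharp
using System;

class FactorDemo
{
    // same weighting as in the Factor() function of the WaveFile class
    static double Factor(int pos, int max, int song)
    {
        if (song == 0)
            return 1 - Math.Pow((double)pos / (double)max, 2);
        else
            return Math.Pow((double)pos / (double)max, 2);
    }

    static void Main()
    {
        int max = 100; // length of the crossfade in samples (arbitrary demo value)
        for (int pos = 0; pos <= max; pos += 25)
        {
            double fadeOut = Factor(pos, max, 0); // weight of the first song
            double fadeIn = Factor(pos, max, 1);  // weight of the second song
            // the two weights always sum to 1
            Console.WriteLine("{0,3}: out = {1:F4}, in = {2:F4}", pos, fadeOut, fadeIn);
        }
    }
}
```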

Thursday, March 10, 2011

Write Wave Files

The previous post showed how to read Wave files byte by byte. Now I want to show how to write them.
This post is the exact reverse of the previous one: I will show how to write a previously read Wave file back to the hard disk.
Since I explained the structure and other basics of Wave files in the previous post, here I just briefly post the source code of 2 functions with which the class WaveFile can be extended.
The principle is analogous to reading: one by one, the 3 chunks "RIFF", "fmt " and "data" are written to the file.

        public void StoreWave(string path)
        {
            System.IO.FileStream fs = System.IO.File.OpenWrite(path); // open target file
            StoreChunk(fs, "RIFF"); // write RIFF chunk
            StoreChunk(fs, "fmt "); // write fmt chunk
            StoreChunk(fs, "data"); // write data chunk
        }

        private void StoreChunk(System.IO.FileStream fs, string chunkID)
        {
            System.Text.ASCIIEncoding Decoder = new ASCIIEncoding();
            // convert the name in bytes and write it
            fs.Write(Decoder.GetBytes(chunkID), 0, 4);

            if (chunkID == "RIFF")
            {
                // in the RIFF chunk, write FileSize and the audio format
                fs.Write(System.BitConverter.GetBytes(FileSize), 0, 4);
                fs.Write(Decoder.GetBytes(Format), 0, 4);
            }
            if (chunkID == "fmt ")
            {
                // in the fmt chunk, write its size as well as the other information
                fs.Write(System.BitConverter.GetBytes(FmtChunkSize), 0, 4);
                fs.Write(System.BitConverter.GetBytes(AudioFormat), 0, 2);
                fs.Write(System.BitConverter.GetBytes(NumChannels), 0, 2);
                fs.Write(System.BitConverter.GetBytes(SampleRate), 0, 4);
                fs.Write(System.BitConverter.GetBytes(ByteRate), 0, 4);
                fs.Write(System.BitConverter.GetBytes(BlockAlign), 0, 2);
                fs.Write(System.BitConverter.GetBytes(BitsPerSample), 0, 2);
            }
            if (chunkID == "data")
            {
                // in the data chunk, write the size of the data block
                fs.Write(System.BitConverter.GetBytes(DataSize), 0, 4);
                // then write the single samples for all audio channels 
                for (int i = 0; i < Data[0].Length; i++)
                {
                    for (int j = 0; j < NumChannels; j++)
                    {
                        fs.Write(System.BitConverter.GetBytes(Data[j][i]), 0, BlockAlign / NumChannels);
                    }
                }
            }
        }

A sample usage, which reads a Wave file and then writes it again to another file, could look as follows:

WaveFile WF1 = new WaveFile();
WF1.LoadWave(@"C:\Users\User\Desktop\mix.wav");
WF1.StoreWave(@"C:\Users\User\Desktop\stored.wav");

Monday, March 7, 2011

Read in Wave Files

In this post I want to show how to read Wave files using the programming language C#. This post is not about playing these files, but about the exact analysis of the file format, that is, how a Wave file is made up and how it can be read byte by byte (for playing, see for example here).

A Wave file is an audio file which stores sampled sound.
It is made up of multiple chunks (blocks). A lot can be done with these; in this post, though, I will only describe the simplest (but also the default) format of Wave files, consisting of 3 chunks. For the technical specification of the Wave format see this page. There (and of course on many other pages too) the format is explained very neatly; I just want to summarize this information here and leave a lot out.
A little tip: Wave files with exactly the structure described here can easily be created with the free program Audacity (for example from MP3 files).

The first 2 entries of a chunk are always the same: the name and the size of the chunk are each coded with 4 bytes.

The first chunk is named "RIFF". The following size indication (1) describes the size of the whole Wave file minus 8, since the name and size fields themselves are not counted. The 3rd entry of the 1st chunk contains the string "WAVE" (2).
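For illustration, the first 12 bytes of a Wave file can be built and decoded like this (a minimal sketch; the file size of 2048 bytes is an invented example value):

```csharp
using System;
using System.Text;

class RiffHeaderDemo
{
    static void Main()
    {
        int fileSize = 2048; // invented example: total file size in bytes
        byte[] header = new byte[12];

        // bytes 0-3: chunk name "RIFF"
        Encoding.ASCII.GetBytes("RIFF").CopyTo(header, 0);
        // bytes 4-7: file size minus 8, as little-endian Int32
        BitConverter.GetBytes(fileSize - 8).CopyTo(header, 4);
        // bytes 8-11: the format string "WAVE"
        Encoding.ASCII.GetBytes("WAVE").CopyTo(header, 8);

        // decoding works just like in the reader code shown further down
        Console.WriteLine(Encoding.ASCII.GetString(header, 0, 4)); // "RIFF"
        Console.WriteLine(BitConverter.ToInt32(header, 4));        // 2040
        Console.WriteLine(Encoding.ASCII.GetString(header, 8, 4)); // "WAVE"
    }
}
```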

Now comes the 2nd chunk, bearing the name "fmt " (the trailing space is important). The size indication (3) describes the size of the rest of this chunk and has the value 16. The next 2 bytes store the audio format (4): 1 means uncompressed storage, other values indicate a compression. The next 2 bytes code the number of audio channels (5). The next 4 bytes describe the sample rate (6), saying how many values of the audio signal are stored per second. The next 4 bytes code the byte rate (7), describing how many bytes per second need to be read to play the audio signal. The next 2 bytes represent the number of bytes used to store a single sample across all audio channels (8). The last 2 bytes of this chunk code the number of bits (!) used to store a single sample value of one channel (9).
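Fields (7) and (8) are actually redundant: they can be derived from the sample rate, the number of channels and the bits per sample. A small sketch of these relationships (the CD-quality values are just an example, not taken from a real file):

```csharp
using System;

class FmtMathDemo
{
    static void Main()
    {
        // example values for CD-quality stereo audio
        int sampleRate = 44100;   // (6) samples per second
        int numChannels = 2;      // (5)
        int bitsPerSample = 16;   // (9)

        // (8) BlockAlign: bytes needed for one sample across all channels
        int blockAlign = numChannels * bitsPerSample / 8;
        // (7) ByteRate: bytes that have to be read per second of playback
        int byteRate = sampleRate * blockAlign;

        Console.WriteLine("BlockAlign = {0} bytes", blockAlign);  // 4
        Console.WriteLine("ByteRate   = {0} bytes/s", byteRate);  // 176400
    }
}
```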

Then follows the 3rd block, the actual data block.
As always, the chunk starts with its name ("data") and its size (10), 4 bytes each. Then follows the actual data of the Wave file (11), which is basically a sequence of audio amplitudes.
The samples are stored one after another, with the individual audio channels interleaved: the "data" chunk first contains sample 1 (first audio channel 1, then channel 2, etc.), then sample 2 with the same structure, and so on. Each group of bytes per sample and channel, interpreted as an integer, describes the amplitude at that point in time.
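With this interleaved layout, the position of any single value inside the data block can be computed directly. A minimal sketch (the helper function SampleOffset is mine, and the 16-bit stereo values are invented for illustration):

```csharp
using System;

class OffsetDemo
{
    // byte offset of sample i, channel j inside the "data" chunk,
    // assuming the interleaved layout described above
    static int SampleOffset(int i, int j, int numChannels, int bitsPerSample)
    {
        int bytesPerValue = bitsPerSample / 8;
        return (i * numChannels + j) * bytesPerValue;
    }

    static void Main()
    {
        // invented example: 16-bit stereo
        Console.WriteLine(SampleOffset(0, 0, 2, 16)); // 0: sample 0, channel 1
        Console.WriteLine(SampleOffset(0, 1, 2, 16)); // 2: sample 0, channel 2
        Console.WriteLine(SampleOffset(1, 0, 2, 16)); // 4: sample 1, channel 1
    }
}
```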

With that, my little description of the Wave format is finished; now comes the code of the C# program, which reads Wave files structured as described above.
The core of the program is the class WaveFile; it provides a function to read Wave files and stores the information of the file in its fields. The characteristic values mentioned above (number of channels etc.) are marked in the source code with the corresponding numbers.
The function to read the Wave file is LoadWave(), which expects the path of the file.
This function calls LoadChunk() 3 times, which reads one chunk each: first the name of the chunk is read, which then determines what to do.
I hope the source code is clear.
As with all projects presented here: the program is just a starting point for further work.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            WaveFile WF1 = new WaveFile();
            WF1.LoadWave(@"C:\Users\User\Desktop\mix.wav");
        }
    }

    public class WaveFile 
    {
        int FileSize; // 1
        string Format; // 2
        int FmtChunkSize; // 3
        int AudioFormat; // 4
        int NumChannels; // 5
        int SampleRate; // 6
        int ByteRate; // 7
        int BlockAlign; // 8
        int BitsPerSample; // 9
        int DataSize; // 10

        int[][] Data; // 11

        public void LoadWave(string path)
        {
            System.IO.FileStream fs = System.IO.File.OpenRead(path); // open Wave file
            LoadChunk(fs); // read RIFF chunk
            LoadChunk(fs); // read fmt chunk
            LoadChunk(fs); // read data chunk
        }

        private void LoadChunk(System.IO.FileStream fs)
        {
            System.Text.ASCIIEncoding Encoder = new ASCIIEncoding();

            byte[] bChunkID = new byte[4];
            /* read the first 4 bytes, which should be the name */
            fs.Read(bChunkID, 0, 4);
            string sChunkID = Encoder.GetString(bChunkID); // decode the name

            byte[] ChunkSize = new byte[4];
            /* the next 4 bytes code the size */
            fs.Read(ChunkSize, 0, 4);

            if (sChunkID.Equals("RIFF"))
            {
                // what to do with the RIFF chunk:
                // save size in FileSize
                FileSize = System.BitConverter.ToInt32(ChunkSize, 0);
                // determine the format
                byte[] Format = new byte[4];
                fs.Read(Format, 0, 4);
                // should be "WAVE" as string
                this.Format = Encoder.GetString(Format);
            }

            if (sChunkID.Equals("fmt "))
            {
                // in the fmtChunk: Save size in FmtChunkSize
                FmtChunkSize = System.BitConverter.ToInt32(ChunkSize, 0);
                // readout all the other header information
                byte[] AudioFormat = new byte[2];
                fs.Read(AudioFormat, 0, 2);
                this.AudioFormat = System.BitConverter.ToInt16(AudioFormat, 0);
                byte[] NumChannels = new byte[2];
                fs.Read(NumChannels, 0, 2);
                this.NumChannels = System.BitConverter.ToInt16(NumChannels, 0);
                byte[] SampleRate = new byte[4];
                fs.Read(SampleRate, 0, 4);
                this.SampleRate = System.BitConverter.ToInt32(SampleRate, 0);
                byte[] ByteRate = new byte[4];
                fs.Read(ByteRate, 0, 4);
                this.ByteRate = System.BitConverter.ToInt32(ByteRate, 0);
                byte[] BlockAlign = new byte[2];
                fs.Read(BlockAlign, 0, 2);
                this.BlockAlign = System.BitConverter.ToInt16(BlockAlign, 0);
                byte[] BitsPerSample = new byte[2];
                fs.Read(BitsPerSample, 0, 2);
                this.BitsPerSample = System.BitConverter.ToInt16(BitsPerSample, 0);
            }

            if (sChunkID == "data")
            {
                // dataChunk: Save size in DataSize
                DataSize = System.BitConverter.ToInt32(ChunkSize, 0);

                // the first index of Data specifies the audio channel, the 2nd the sample
                Data = new int[this.NumChannels][];
                // temporary array for reading in bytes of one channel per sample
                byte[] temp = new byte[BlockAlign / NumChannels];
                // for every channel, initialize data array with the number of samples
                for (int i = 0; i < this.NumChannels; i++)
                {
                    Data[i] = new int[this.DataSize / (NumChannels * BitsPerSample / 8)];
                }

                // traverse all samples
                for (int i = 0; i < Data[0].Length; i++)
                {
                    // iterate over all channels of the current sample
                    for (int j = 0; j < NumChannels; j++)
                    {
                        // read the correct number of bytes per sample and channel
                        if (fs.Read(temp, 0, BlockAlign / NumChannels) > 0)
                        {   // depending on how many bytes were used,
                            // interpret the amplitude as Int16 or Int32
                            if (BlockAlign / NumChannels == 2)
                                Data[j][i] = System.BitConverter.ToInt16(temp, 0);
                            else
                                Data[j][i] = System.BitConverter.ToInt32(temp, 0);
                        }
                        /* else
                         * values other than 2 or 4 are not handled here
                        */
                    }
                }
                    }
                }
            }
        }
    }
}