February, 2011 Archives
Feb
Laptop vs. Desktop smackdown
by Mikael Lundin in Technicalities
Since the prices on RAM is all time low I seized the moment and doubled the amount in my developer machine from 4Gb to 8Gb. Let’s just say that 8Gb is the new 4Gb now.
Developing on a laptop or desktop?
I’m kind of biased about doing my development on a laptop or a desktop. I use a laptop at work and love the mobility of it, but feel very limited by the performance. My development desktop at home is heavy as hell, and not something you would move around, but it is the most stable computer I’ve ever owned.
The laptop
If we were to boil it down, what are the cons/pros for using a laptop in development.
| Pros | Cons |
|---|---|
| Mobility | Expensive performance |
| Built in UPS | Heat problems |
| Ergonomics, screen size | |
| No tinkering, hard to service | |
| Video and audio is crap |
When you sit in front of your computer for 8-10 hours straight you have to think about ergonomics. This usually means that external screen, keyboard and mouse are needed even though it’s all included in the laptop.
A laptop is much more use and throw away than desktop, because it is so hard to service. You won’t switch that CPU when it becomes too old and you won’t exchange motherboard for USB 3 support.
The Desktop
| Pros | Cons |
|---|---|
| Powerful hardware available | Stationary, you won’t move it around |
| Easier to cool down, not as noisy or warm | Takes up a lot of space |
| Cheaper in the long run, you switch components | |
| Customizable, you can add 2 hard drives if you want |
The largest pro here is that you can get much more raw power from your desktop and that can be upgraded as new components appear on the market. Waiting for compilation is never fun.
Summary
Would I rather work on a desktop or a laptop. Neither, both. I would love to have a solution where my workplace provided a desktop for development, and a laptop that is always synced with the desktop. When ever I needed to go out to a customer or into a meeting I could grab my laptop and know that it was up to date with my latest work. This could be done with dropbox or any similar solution.
This of course would make me a very expensive developer needing both 2 sets of hardware and 2 sets of licenses. But hey! employees salaries are expensive, not hardware.
Feb
ArrayList by units and objects in Pascal
by Mikael Lundin in Programming
The last programming task my father had in store for me was writing a record store library application in Turbo Pascal. I was 14 years old and managed to produce something that would accept new records, artists and store them in a list that could be saved to and loaded from disc. The result didn’t corrupt its database much more than SourceSafe corrupts its repository.
What if we would try to recreate some of that application.
ArrayList in Pascal by objects and units
Yes! Turbo Pascal did have object orienation as of version 5.5 no later than 1989. If we’re going to create a record library we should try to use the tools of abstraction available to us. By creating an ArrayList that abstracts the complexity of linked lists we could make the main program less complex.
By creating a unit (pascal library) we can keep our abstraction in there, and call to it from our main program. Here’s my implementation of the unit that contains the ArrayList.
unit Collections; interface type (* Linked List *) ListRef = ^List; List = record current : Pointer; (* Some object *) next : ListRef; end; (* Array List Object *) ArrayListRef = ^ArrayList; ArrayList = object first : ListRef; last : ListRef; constructor Init; destructor Done; procedure Add(el : Pointer); function Empty : boolean; function Head : Pointer; function Tail : ArrayListRef; end; implementation constructor ArrayList.Init; begin first := nil; last := nil; end; (* Free all elements from the linked list *) procedure DisposeAll(list : ListRef); var next : ListRef; begin if list <> nil then begin next := list^.next; Dispose(list); (* Free memory *) DisposeAll(next); end end; destructor ArrayList.Done; begin last := nil; DisposeAll(first); first := nil; end; (* Add element to the list *) procedure ArrayList.Add(el : Pointer); var item : ListRef; begin New(item); item^.current := el; if first = nil then first := item else last^.next := item; last := item; end; (* Is the list empty? *) function ArrayList.Empty : boolean; begin Empty := first = nil; end; (* First element in list *) function ArrayList.Head : Pointer; begin if Empty then Head := nil else Head := first^.current end; (* Rest of the list *) function ArrayList.Tail : ArrayListRef; var list : ArrayListRef; begin New(list, Init); list^.first := first^.next; list^.last := last; Tail := list; end; end.
- It is important that the name of the unit is the same as the name of the file.
- The interface declaration contains everything in the unit that should be visible to the calling program
- The ArrayList will contain an underlying linked list. In order to refrain from restricting the list to a specific type we store pointers to memory addresses in the list. This could be pointer to integers, strings, records or other objects.
- The object declaration looks pretty much like a record declaration, except it contains procedures and functions that belongs to the object.
- The constructor is the code that should be run on creation of the ArrayList.
- The destructure will clean up the memory that the list holds, when destructing the list. If we were just to dispose the ArrayList object, the underlying list would remain and hog up RAM, a so called memory leak.
- The implementation section of the unit is where we write the code. Here we can hide functions and procedures that we do not want to expose in the interface.
- DisposeAll will go recursively through the list and free memory of every item in it, before the whole object is disposed.
- ArrayList.Add is an instance method of the ArrayList that will take a pointer to an object and add it to the list. This is why we need the last-pointer that will remain pointing at the last element on the list.
- The Tail function will create a new ArrayList that will point to the next element in the list as its first element. This tail should be disposed after use, but not its contents since it is common with the main list.
Now we can use this list to store pointers to any kind of variable, record or object. In my program RecordStore I will use my ArrayList and add records to it.
program RecordStore;
uses Collections; (* Import the type ArrayList *)
type
RecordRef = ^RecordObject;
RecordObject = object
title : string;
artist : string;
constructor Create(titleInput, artistInput : string);
procedure Print;
end;
var records : ArrayListRef;
(* RecordObject instance procedures *)
constructor RecordObject.Create(titleInput, artistInput : string);
begin
title := titleInput;
artist := artistInput;
end;
procedure RecordObject.Print;
begin
Write(title);
Write(' - ');
Writeln(artist);
end;
(* Main program *)
procedure Add(title, artist : string);
var my_record : RecordRef;
begin
New(my_record, Create(title, artist));
records^.Add(my_record);
end;
(* Go through list and print every record *)
procedure Print(list : ArrayListRef);
var
tail : ArrayListRef;
current : RecordRef;
begin
if not list^.Empty then begin
current := list^.Head;
current^.Print; (* Print record *)
tail := list^.Tail;
Print(tail);
Dispose(tail); (* Disposes the tail, not the records in the list *)
end;
end;
begin
(* Create new list of records *)
New(records, Init);
Add('Dark side of the moon', 'Pink Floyd');
Add('The Rise and Fall of Ziggy Stardust and the Spiders from Mars', 'David Bowie');
Add('L.A. Woman', 'The Doors');
Print(records);
Dispose(records, Done);
end.
The possibilities here are endless.
- We import the previously created unit by name.
- We use an object to store the Record information. This object takes a title and artist in the constructor. It also have a procedure Print for writing the contents to stdout.
- Create a new RecordObject and add the pointer to the records list. This has now become trivial.
- Printing the list will now use the Head/Tail functionality and go through the list recursively and print Head. Just remember to dispose the tail (without calling done, because we don’t want to destroy the list).
It is code like this that makes me happy. You can download it all from my bitbucket repository or just download it as a zip. Enjoy!
Feb
Citerus programming challenge
by Mikael Lundin in F#, Programming
I’m very weak for programming challenges. Citerus held a contest where you could win an iPad if you would solve their problem. Since I’m a .NET developer, I’m not elegible for winning such a challenge but I love to meet a challenge just for the sake of it.
The Challenge
Given a string
VOCDIITEIOCRUDOIANTOCSLOIOCVESTAIOCVOLIOCENTSU
And a collection of abbreviations
TDD, DDD, DI, DO, OO, UI, ANT, CV, IOC, LOC, SU, VO
What is the shortest string you could produce by removing abbreviations from it?
Example
1) ...BIDDDOCDDD... (remove DDD) 2) ...BIDDDOC... (remove DDD again) 3) ...BIOC... (remove IOC)
The solution
This kind of problem is ideal for F#.
The main problem is that you just can’t remove abbreviations, because you won’t find the shortest string. You will have to try to remove abbreviations in all orders to find the shortest string produced. I do this by recursive calls, where every abbreviation removal is a path in the tree. I solve the problem by taking the branch with the shortest result.
// http://www.citerus.se/jfokus
module CiterusChallenge
// In the string
let s = "VOCDIITEIOCRUDOIANTOCSLOIOCVESTAIOCVOLIOCENTSU"
// Any occurrence of
let abbreviations = ["TDD"; "DDD"; "DI"; "DO"; "OO"; "UI"; "ANT"; "CV"; "IOC"; "LOC"; "SU"; "VO"]
// Should be removed
let rec remove (abbreviations : string List) (s : string) =
// Collect any version where one abbr is removed from s
// Filter out those with no effect on s
let collect =
abbreviations
|> List.map (fun abbr -> s.Replace(abbr, ""))
|> List.filter (fun short_s -> short_s.Length < s.Length)
// Select longest string of s1 and s2, or s1 if they're equal
let min (s1 : string) (s2 : string) =
if s1.Length <= s2.Length then s1 else s2
// Match result of abbreviations removal
match collect with
| [] -> s
| _ -> List.fold ( min ) s (collect |> List.map ( remove abbreviations ))
// Execute
let solution = lazy ( s |> remove abbreviations )
Utilities.measure_execution_time solution |> ignore
// Tests
let tests =
"HELLO" = ("HETDDLLDIO" |> remove abbreviations ) && // Two consecutive abbr
"HELLO" = ("HEDTDDILLO" |> remove abbreviations ) && // Two nested abbr
"HELLO" = ("HEDTDDILLO" |> remove abbreviations ) && // Removing TDD before DI
"HELIUM" = ("HELANTDDDIUM" |> remove abbreviations ) && // Don't remove DI or TDD, remove ANT then DDD
"" = ("DTDDIIOC" |> remove abbreviations ) && // Take TDD then IOC last DI will leave only empty string
"B" = ("BIDDDOCDDD" |> remove abbreviations) // Some wierd example from the problem description
Look, F# with syntax highlighting! All in all, the algorithm code is 11 LOC, if you remove the comments, and runs for 320 ms on my machine.
Feb
Beginners guide to Pascal
by Mikael Lundin in Programming
One of the strangest things is that most common search term for this blog is Turbo Pascal, that I’ve mentioned once (now twice) in my personal history of computing. I will attempt to honor that by posting a small beginners guide to Pascal.
All code presented here should be available for download at this link.
The toolset, compiler and editor
I will not be using the old Turbo Pascal 6.0 environment for these examples. It is not that accessible now as it was 15 years ago when I started to learn programming. Instead I recommend any text editor with syntax highlighting for pascal or delphi and Free Pascal compiler. This should be available for most operating system environments.

Your first program
I don’t know why you would like to learn Pascal. It is pretty much a dead language and only lives on in Delphi, which is also on the decline. I can only assume that you need to learn Pascal as a programming assignment at school.
If you like the syntax of Pascal and would like to find a similiar language more up to date I would suggest you’d look into Ada or Delphi. Both languages are much more up to date with todays standard and suitable for production use.
program HelloWorld;
begin
Writeln('Hello World!');
end.
- Name of the program, remember the semi-colon at the end (;)
- Main program listing begins here
- Write Hello World! to console output
- End the program listing, the main end has a dot (.) instead of a semicolon (;)
When I compile this I get one .o object file and one executable. If I run the executable I will get “Hello World!” written to the console window. Cool!
fpc src/helloworld.pas -obin/helloworld
This will compile the source code to bin/helloworld (use helloworld.exe if you’re in DOS32 environment).
Variables
A variable is an identifier that has a value. You use them to store values into memory for later processing. There are 6 main types in the Pascal programming language.
| Type | Example |
|---|---|
| integer | Numbers: 3, 34, 0, -345 |
| real | Numbers with decimals: 3.0, -0.05, 3.14 |
| char | Single characters: a, F, 2, ! |
| string | Multiple characters: Hello World!, Cheers!, My dog is Sam |
| boolean | True or false: true, false |
| pointer | more about this later |
First you declare a variable identifier and type, then you may assign a value to that variable, and them you’re ready to use the variable.
program Variables;
var width : integer;
var height : integer;
var area : integer;
begin
width := 12;
height := 4;
area := width * height;
Write('With the width ' +);
Write(width);
Write(' and the height ');
Write(height);
Write(' you have the area ');
Writeln(area);
end.
And you will get the expected output.
With the width 12 and the height 4 you have the area 48
If you have a value that never changes, you should use a constant instead of a variable. Here’s an example with the constant of PI.
program Constants;
const pi = 3.14;
var radius, area : real;
begin
radius := 5.0;
area := radius * radius * pi;
Write('With the radius ');
Write(radius:1:0);
Write(' we have the area ');
Writeln(area:6:2);
end.
- The strange notation is formatting. It means that we should use 1 space for the number and display 0 decimals.
- The same way this means, use 6 spaces for the number with 2 decimals.
The expected output is
With the radius 5 we have the area 78.50
Read input from user
You easily read input from user with Readln.
program ShoeSize;
var name : string;
var size : real;
begin
Write('What is your name: ');
Readln(name);
Write('What is your shoe size: ');
Readln(size);
Write('Hello '); Write(name);
Write(', your shoesize is '); Writeln(size:3:1);
end.
And the output looks like this. Yellow markings are input done by the user.
What is your name: Mikael What is your shoe size: 41.5 Hello Mikael, your shoesize is 41.5
Control flow
Time to take a look at if-then-else and looping constructs in Pascal.
program GuessTheNumber;
const Max = 1000;
var number, guess, guesses : integer;
begin
Writeln('The number is between 1-1000.');
(* Initialize variables *)
number := Random(Max) + 1;
guess := -1;
guesses := 0;
while guess <> number do begin
Write('Guess the number: ');
Readln(guess);
if guess > number then
Writeln('Your guess was too high')
else
Writeln('Your gess was too low');
guesses := guesses + 1;
end;
Write('You found the number in ');
Write(guesses);
Writeln(' tries.');
end.
- Rand(Max) will randomize a number of 0-999. We add 1 to get a random number of 1-1000.
- The while loop will not exit until guess is equal to number. Begin marks the beginning of a code block.
- Runs the next expression if guess is larger than number, otherwise runs the else case.
- Notice that semicolon (;) is missing when there is an else case.
The expected output is
The number is between 1-1000. Guess the number: 500 Your gess was too low Guess the number: 800 Your guess was too high Guess the number: 750 Your guess was too high Guess the number: 650 Your guess was too high Guess the number: 550 Your guess was too high Guess the number: 525 Your gess was too low Guess the number: 535 Your gess was too low Guess the number: 545 Your gess was too low Guess the number: 548 Your gess was too low Guess the number: 549 Your gess was too low You found the number in 10 tries.
As you notice, the code will tell us that “Your guess was too low” even when we’re spot on. How do we fix that bug?
For-loop
The for loop starts at a number and counts it up until it reaches a max.
program Pyramid;
var i, j : integer;
var height : integer;
begin
Write('Height of pyramid: ');
Readln(height);
for i := 1 to height do begin
(* Write empty spaces before building blocks *)
for j := 1 to (height - i) do
Write(' ');
(* Write building blocks *)
for j := 1 to (i + (i - 1)) do
Write('*');
Writeln();
end;
end.
- Will loop “height” number of times. First time i will be 1, second time it will be 2 and so on. Notice the begin that marks beginning of a block that ends on line 19.
- It’s ok to have inner for loops and to use arithmetic expressions to calculate upper bound.
The output of this should be
Height of pyramid: 5
*
***
*****
*******
*********
And we could also count down, with the keyword downto.
program Factorial;
var i, n, result : integer;
begin
Write('Calculate factor of: ');
Readln(n);
result := 1;
for i := n downto 1 do
result := result * i;
Write(n); Write('! = '); Writeln(result);
end.
The output of this would be
Calculate factor of: 5
5! = 120
Case-else can be quite useful for building menu options.
program Menu;
var input : integer;
begin
(* Initialize variables *)
input := 0;
while input <> 5 do begin
Writeln('Welcome, please select any of the following');
Writeln('1. Vegetables');
Writeln('2. Fruit');
Writeln('3. Gardening tools');
Writeln('4. Electronic equipment');
Writeln('or 5 to quit');
Write('> ');
Readln(input);
case input of
1..2 : Writeln('You selected food');
3, 4 : Writeln('You selected a tool');
5 : Writeln('Thank you, and welcome back')
else
Writeln('Error: Unrecognized option');
end;
end;
end.
- You can scoop up cases based on ranges of values.
- or you can use discrete values
- Notice that this line has no semicolon ending, because there is an else on the next line.
- The case statement has an end; even without begin.
Part of the output looks like this
Welcome, please select any of the following
1. Vegetables
2. Fruit
3. Gardening tools
4. Electronic equipment
or 5 to quit
> 88
Error: Unrecognized option
String and arrays
Strings are arrays of characters. You can specify the length of the string when you define them, or leave the length out and have it set to 255. Who needs longer strings than that anyway?
program Encrypt;
const Alphabet = 25;
var data, result : string; (* char array of 255 characters *)
var encKey, i : integer;
var encChar : char;
begin
Write('Enter the encryption key: ');
Readln(enckey);
Write('Enter the phrase to encrypt (capital letters): ');
Readln(data);
result := '';
for i := 1 to Length(data) do begin
(* Add the encryption key digit to all characters *)
encChar := Chr(((Ord(data[i]) + encKey) mod Alphabet) + Ord('A'));
result := result + encChar;
end;
Write('Encrypted phrase: ');
Writeln(result);
end.
- The Length(data) function will return the length of the array, or string in this case.
- Ord(c) will get ascii number for the character, mod is the modulus operator and Chr(i) will get the char for that ascii number. The result is an encrypted character.
I assume that you did recognize the ceasar cipher used. Not very strong, but works well on the feeble minded.
Enter the encryption key: 1337 Enter the phrase to encrypt (capital letters): SECRET Encrypted phrase: UGETGV
Arrays
You can create a new array by saying var [name] : array[x..y] of [type]. Here’s an example of the fibonacci calculation.
program Fibonacci;
const Max = 10;
var numbers : array[1..Max] of integer;
var i : integer;
begin
numbers[1] := 1;
numbers[2] := 2;
(* Calculate fibonacci sequence *)
for i := 3 to Max do
numbers[i] := numbers[i - 2] + numbers[i - 1];
(* Print out the sequence *)
Write('Fibonacci: ');
for i := 1 to Max do begin
Write(numbers[i]);
Write(' ');
end;
Writeln;
end.
- The Max constant used for top limit of the array could as easily been a number. I used a constant here for reuse in both for statements.
And the expected printout as follows.
Fibonacci: 1 2 3 5 8 13 21 34 55 89
Procedures
A procedure is some piece of code that you might want to cut out and call several times.
program Procedures;
const Width = 33;
procedure Title;
begin
Writeln('*** PASCAL WILL RULE THE WORLD ***');
end;
procedure Separator;
var i : integer;
begin
for i := 0 to Width do
Write('*');
Writeln;
end;
(* Main program begins here *)
begin
Separator;
Title;
Separator;
end.
- Notice that this constant is global, which means that it can be reached within a procedure, example at line 12.
- This variable i is local for the procedure, which means that it will be destroyed when the execution ends the procedure.
Expected output
********************************** *** PASCAL WILL RULE THE WORLD *** **********************************
You can send arguments to a procedure as I do with the Swap procedure below. I’m using the numbers array as a global variable. Shame on me.
program BubbleSort;
const Max = 20;
var numbers : array[1..Max] of integer;
(* Randomize digits in the array *)
procedure Randomize;
var i : integer;
begin
for i := 1 to Max do
numbers[i] := Random(100);
end;
(* Swap two values in the array *)
procedure Swap(x, y : integer);
begin
numbers[x] := numbers[x] xor numbers[y];
numbers[y] := numbers[x] xor numbers[y];
numbers[x] := numbers[x] xor numbers[y];
end;
procedure Sort;
var i, j : integer;
begin
for i := 1 to Max - 1 do begin
for j := i + 1 to Max do begin
if numbers[i] > numbers[j] then
Swap(i, j);
end;
end;
end;
procedure Print;
var i : integer;
begin
for i := 1 to Max do begin
Write(numbers[i]);
Write(' ');
end;
Writeln;
end;
(* Main program starts here *)
begin
Randomize;
Sort;
Print;
end.
- You will recognize the infamous XOR swap algorithm here. We define two arguments with the type integer.
- We call the Swap procedure with the arguments i, and j which are positions in the array that needs swapping.
The expected output
5 27 29 38 38 42 43 47 54 54 59 60 62 64 71 84 84 85 89 96
Functions and recursion
The difference with functions and procedures is that functions will return a value which makes it usable for recursion, i.e. calling itself.
Following function uses recursion to do a binary search algorithm on a sorted list. Complexity should me O(n log n).
program Find;
const Min = 1;
const Max = 10;
type Vector = array[Min..Max] of integer;
var guess : integer;
function CreateVector() : Vector;
var v : Vector;
begin
v[1] := 27; v[2] := 29; v[3] := 38; v[4] := 42; v[5] := 43;
v[6] := 47; v[7] := 54; v[8] := 59; v[9] := 60; v[10] := 62;
CreateVector := v;
end;
function Exists(min, max, search : integer; v : Vector) : boolean;
var middle : integer;
begin
(* Found *)
if (search = v[min]) or (search = v[max]) then
Exists := true
(* Not found *)
else if max - min < 2 then
Exists := false
(* Keep looking *)
else begin
middle := min + Trunc((max - min) / 2);
if (search >= v[middle]) then
Exists := Exists(middle, max, search, v)
else
Exists := Exists(min, middle - 1, search, v);
end;
end;
(* Main program starts here *)
begin
Write('Test if a number is in vector: ');
Readln(guess);
Writeln(Exists(Min, Max, guess, CreateVector()));
end.
- I create a type alias for the array. It will be easier reference to it in function calls.
- A function that should return a Vector
- You return a value by giving the function the value you want to return.
- min, max and search are arguments of integer, and v is of Vector. The return result is true/false that indicates if value is found.
- Recursive call to the same function that is running.
Test if a number is in vector: 38
TRUE
Records and pointers – linked lists
Records can be used to bundle primitives together and the most useful combination is with pointers. As you can use pointers from one record to another record and create linked lists. In the following piece of code I will use the sieve of Eratosthenes to produce the first 100 primes.
program Primes;
type
ListRef = ^List;
List = record
current : integer;
next : ListRef;
end;
var result : ListRef;
(* Build a list of integers ranging from min to max *)
function Range(min, max: integer) : ListRef;
var list : ListRef;
begin
New(list);
if min = max then
list^.current := max
else begin
list^.current := min;
list^.next := Range(min + 1, max);
end;
Range := list;
end;
(* Filter all the specific factors from the list *)
function FilterFactors(numbers : ListRef; factor : integer) : ListRef;
var next : ListRef;
begin
if numbers = nil then
FilterFactors := numbers
else if numbers^.current mod factor = 0 then begin
next := numbers^.next;
numbers^.next := nil;
Dispose(numbers);
FilterFactors := FilterFactors(next, factor);
end
else begin
numbers^.next := FilterFactors(numbers^.next, factor);
FilterFactors := numbers;
end;
end;
(* Remove all numbers that aren't primes *)
function FilterPrimes(numbers : ListRef) : ListRef;
begin
if numbers <> nil then begin
numbers^.next := FilterFactors(numbers^.next, numbers^.current);
numbers^.next := FilterPrimes(numbers^.next);
end;
FilterPrimes := numbers;
end;
(* Print the list *)
procedure Print(numbers : ListRef);
begin
if numbers <> nil then begin
Write(numbers^.current);
Write(' ');
Print(numbers^.next);
end;
Writeln;
end;
(* Main program starts here *)
begin
result := FilterPrimes(Range(2, 100));
Print(result);
end.
- Alias the List pointer type to ListRef
- Define a record of an integer and a pointer to next item in the linked list
- Create space on the heap for a new list item
- Clear the memory that the pointer points to, to be used by other programs
The output is as expected
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
Read on…
I’m impressed that you read this far. It must mean that you have almost as perverted mind as I have (that spent a whole Sunday writing Pascal examples). I could continue and tell you about writing and reading files from disc. I could go into graphics programming or object oriented programming with Pascal.
I won’t. Sorry. But I have uploaded all of my examples so that you can run them yourself in your favorite compiler. Have fun!
Update: This guide now has a follow up on units and objects.
Feb
Crash course in regular expressions
by Mikael Lundin in Programming
So we were discussing regular expressions and how it can be completely unreadable, but I really love this language. Mostly because it is a one purpose language, string matching, and it serves that purpose extremly well.
Here’s a crash course in regular expressions.
Rad Software – Regular expression designer
This is such a simple tool and yet so useful. By far my favorite regular expression designer out there. Download the 200k install and rock on!

Pattern matching
Consider the following example. The pattern matches everything marked in yellow.
# match a substring String: Uncle is my mother's brother Pattern: other
It will match “other” twice in the sentence. That’s easy enough but not very useful.
\w means word character and \s means whitespace. What do we write if we would like to match words?
# match words String: The quick brown fox jumps over the lazy dog Pattern: \w+
This makes a very inefficient string splitter.
# match one or more word character |or| dash String: Lars-Åke Bengtsson Pattern: [\w-]+
The beginning of the string matches with ^ and the end of the string as $. You can also use a|b for matching a or b.
# match first and last word of string String: Luke, I am your father Pattern: ^\w+|\w+$
Watch out, because $ could mean end of string or end of line, depending on the options you send into the regex parser
Match groups
A match group should be something that you would like to distinguish from other matches.
# Distinguish element name String: <b>I can haz hatz!</b> Pattern: </?(\w+)> Matching group #1: <b>I can haz hatz!</b>
The questionmark `?` indicates that the preceeding character could exist in the match, but does not have to.
We can name the matching groups like this.
# get parts of an e-mail address String: spam@litemedia.se Pattern: (?<username>.+)@(?<server>.+) Matching group #username: spam@litemedia.se Matching group #server: spam@litemedia.se
The dot `.` matches anything. .+ means, match something at least once.
Lazy and greedy
Consider the following match where we would like to find the format string in the expression.
# Get the first argument, format string in the expression
String: string.Format("I would like some {0}", "Bananas");
Pattern ".*"
The pattern matches everything up to the last qoute, where as we only would like it to match up to the first quote. This is because * is greedy by default. We can change it to lazy with a questionmark.
# Get the first argument, format string in the expression
String: string.Format("I would like some {0}", "Bananas");
Pattern ".*?"
Look-ahead and look-behind
You can match things that come before another expression, or after.
# Find first letter of sentence (positively look-behind) String: Tree you are. Moss you are. You are violets with wind above them. Pattern: (?<=^|\.\s*)\w
Word character that comes first in the string or after a dot and some whitespace.
# Any digit not before a 1 String: 11011001 pattern: \d(?!1)
Backreference
You can reference to a previously defined group.
# Match content within tags
String: <q>Hardware: The parts of a computer </system> that can be kicked.</q>
Pattern: (?<=<(?<el>\w+)>).*?(?=</\k<el>)
Matching group #el: q
The content should be preceeded by an opening tag and followed by a closing tag with the same element name.
Usage in C#
This is how you would use a regular expression in C#.
var expression = new Regex(@"(?<=<(?<el>\w+)>).*?(?=</\k<el>)");
var data = "<li>Tulip</li><li>Lily</li><li>Duffydil</li>";
var matches = expression.Matches(data);
foreach (Match match in matches)
{
Console.WriteLine("Flower: {0}", match.Value);
}
This will print out all the flower names to the console window.
One of the best resources for regular expressions I’ve found is regular-expressions.info. Now, go along and have fun!
