C# Tutorial - Dissecting Our Second Application – Strings


The parameters passed into the application take the form of strings. When you put characters together, they make words, phrases, and sentences. In programming, a sequence of characters is called a string. In source code, a string value is recognised by the pair of double quotes that encloses it.

string is an alias for the .NET Framework System.String class, which represents an immutable string of characters - immutable because the value within the string cannot be modified once it has been created. Methods that appear to modify a string actually return a new string containing the modified version. A string value written directly in source code is known as a string literal. C# supports two forms: regular string literals and verbatim string literals.
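As a small sketch of what immutability means in practice, the program below calls two System.String methods and shows that the original value is left untouched. The class name ImmutabilityDemo and the helper Shout are made up for this illustration.

```csharp
using System;

static class ImmutabilityDemo
{
    // Hypothetical helper: returns an upper-case copy of the text.
    // The string passed in is not changed.
    public static string Shout(string text)
    {
        return text.ToUpperInvariant();
    }

    static void Main()
    {
        string original = "hello";
        string shouted = Shout(original);

        Console.WriteLine(original); // hello
        Console.WriteLine(shouted);  // HELLO

        // Replace also returns a new string. Ignoring the result changes nothing.
        original.Replace("h", "j");
        Console.WriteLine(original); // hello
    }
}
```

To keep a modified version, assign the result to a variable, for example original = original.Replace("h", "j");.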

A regular string literal consists of zero or more characters enclosed in double quotes, as in "hello", and may include both simple escape sequences (such as \t for the tab character), and hexadecimal and Unicode escape sequences.

A verbatim string literal consists of an @ character followed by a double-quote character, zero or more characters, and a closing double-quote character. In a verbatim string literal, the characters between the delimiters are interpreted verbatim, the only exception being a quote-escape-sequence: two consecutive double-quote characters (""), which stand for a single double quote. In particular, simple escape sequences, and hexadecimal and Unicode escape sequences, are not processed in verbatim string literals.

In the following example, string ‘a’ is a regular string literal and ‘b’ is a verbatim string literal.

string a = "hello \t world";
string b = @"hello \t world";

When printed, string ‘a’ contains a tab character between the two words, while ‘b’ contains the backslash and the letter t exactly as they were typed.

hello 	 world
hello \t world

A verbatim string literal may span multiple lines. For example:

string j = @"one
two
three";

output:

one
two
three

The new lines in the verbatim string literal appear in the output.
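Verbatim literals are often used for Windows file paths, and the quote-escape-sequence is the one thing that still needs doubling inside them. The sketch below compares the two literal forms. The path and the class name VerbatimDemo are invented for the example.

```csharp
using System;

static class VerbatimDemo
{
    // Regular literal: every backslash must be written as \\.
    public static string RegularPath() { return "C:\\Temp\\notes.txt"; }

    // Verbatim literal: backslashes are taken as typed.
    public static string VerbatimPath() { return @"C:\Temp\notes.txt"; }

    // Inside a verbatim literal, "" stands for one double quote.
    public static string Quoted() { return @"She said ""hi"""; }

    static void Main()
    {
        Console.WriteLine(RegularPath() == VerbatimPath()); // True
        Console.WriteLine(VerbatimPath());                  // C:\Temp\notes.txt
        Console.WriteLine(Quoted());                        // She said "hi"
    }
}
```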

Escape Sequences

An escape sequence allows special characters to be entered into a string.

A simple-escape-sequence is one of the following:

\'  \"  \\  \0  \a  \b  \f  \n  \r  \t  \v

A hexadecimal-escape-sequence has the format:

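\x   hex-digit   [hex-digit]   [hex-digit]   [hex-digit]

The digits shown in square brackets are optional, so \x is followed by one to four hexadecimal digits.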

A character that follows a backslash character (\) in a regular string literal must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs.

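A hexadecimal escape sequence represents a single Unicode character, with the value formed by the one to four hexadecimal digits following "\x". Because the number of digits is variable, any hexadecimal digits that come directly after the intended value are read as part of the escape sequence.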

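A Unicode character escape sequence is written as \u followed by exactly four hexadecimal digits, or \U followed by exactly eight. In a character literal, it must be in the range U+0000 to U+FFFF, otherwise a compile-time error occurs.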
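The following sketch compares the hexadecimal and Unicode escape forms and illustrates why the variable length of \x needs care. The class and method names are made up for the example.

```csharp
using System;

static class EscapeDemo
{
    // Three ways of writing the letter A (U+0041).
    public static string HexA() { return "\x41"; }
    public static string UnicodeA() { return "\u0041"; }
    public static string LongUnicodeA() { return "\U00000041"; }

    // \x reads up to four hex digits, so b and c are swallowed
    // and the result is the single character U+41BC.
    public static int GreedyHexLength() { return "\x41bc".Length; }

    // \u always reads exactly four digits, so this is A, b, c.
    public static int FixedUnicodeLength() { return "\u0041bc".Length; }

    static void Main()
    {
        Console.WriteLine(HexA() == UnicodeA());         // True
        Console.WriteLine(UnicodeA() == LongUnicodeA()); // True
        Console.WriteLine(GreedyHexLength());            // 1
        Console.WriteLine(FixedUnicodeLength());         // 3
    }
}
```

For this reason \u is usually the safer choice when the escape is followed by ordinary text.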

Each simple escape sequence represents a single Unicode character, as described in the table below.

Escape Sequence    Character Name     Unicode Encoding

\'                 Single quote       0x0027
\"                 Double quote       0x0022
\\                 Backslash          0x005C
\0                 Null               0x0000
\a                 Alert              0x0007
\b                 Backspace          0x0008
\f                 Form feed          0x000C
\n                 New line           0x000A
\r                 Carriage return    0x000D
\t                 Horizontal tab     0x0009
\v                 Vertical tab       0x000B
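As a quick check of the table, the sketch below prints the numeric value of a few simple escape sequences and then uses them to lay out some text. The helper CodeOf and the class name are hypothetical.

```csharp
using System;

static class SimpleEscapeDemo
{
    // Hypothetical helper: numeric value of the first character of s.
    public static int CodeOf(string s)
    {
        return (int)s[0];
    }

    static void Main()
    {
        // X4 formats the value as four upper-case hexadecimal digits.
        Console.WriteLine("0x{0:X4}", CodeOf("\t")); // 0x0009
        Console.WriteLine("0x{0:X4}", CodeOf("\n")); // 0x000A
        Console.WriteLine("0x{0:X4}", CodeOf("\\")); // 0x005C
        Console.WriteLine("0x{0:X4}", CodeOf("\"")); // 0x0022

        // Tab, new line and double quote used inside ordinary text.
        Console.WriteLine("Name:\tC#\nQuote:\t\"hi\"");
    }
}
```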

Declaring Strings

A string can be declared in any of the following ways:

string a;
string b = null;
string c = "";
string d = "C# Rocks!";

Where:

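  • ‘a’ is declared but not initialised. As a local variable it cannot be read until a value has been assigned to it; as a field of a class it defaults to null.
  • ‘b’ is a null string. Null means that the variable does not reference any object.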
  • ‘c’ is an empty string
  • ‘d’ contains the words "C# Rocks!"
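The difference between a null string and an empty string matters as soon as the value is used. The sketch below is one way to tell the cases apart. The class name, the field name and the helper Describe are made up for the example.

```csharp
using System;

class DeclarationDemo
{
    // A field with no initialiser defaults to null
    // (the compiler may warn that it is never assigned).
    static string field;

    // Hypothetical helper that reports which kind of string it was given.
    public static string Describe(string s)
    {
        if (s == null) return "null";
        if (s.Length == 0) return "empty";
        return "\"" + s + "\"";
    }

    static void Main()
    {
        string b = null;
        string c = "";
        string d = "C# Rocks!";

        Console.WriteLine(Describe(field)); // null
        Console.WriteLine(Describe(b));     // null
        Console.WriteLine(Describe(c));     // empty
        Console.WriteLine(Describe(d));     // "C# Rocks!"

        // IsNullOrEmpty treats both cases alike.
        Console.WriteLine(string.IsNullOrEmpty(b)); // True
        Console.WriteLine(string.IsNullOrEmpty(c)); // True

        // string a;
        // Console.WriteLine(a); // does not compile: a is an unassigned local
    }
}
```

Accessing a member such as b.Length on a null string throws a NullReferenceException, whereas c.Length simply returns 0.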


© Publicjoe, 2008