Regular expressions are used to find matches in text. The following is a practical application of regex in C# and Java: parsing a line of a CSV file.
CSV (comma-separated values) files store data as values separated by commas. E.g.:
name,line1,line2,city,zip code,country
You can easily use String.Split() in C# to get all the values. However, there are cases where the data itself contains a comma. E.g.:
"Mr. John Doe, Jr.",7926 Glenbrook Dr., 14623
In this case a regular expression (regex) can be used to determine whether a comma is inside quotes or not.
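To make the problem concrete, here is a minimal Java sketch of the naive approach, using `String.split(",")` as the Java counterpart of `String.Split()`. The class name `NaiveSplitDemo` and the helper `naiveSplit` are made up for illustration. It splits the sample line above on every comma, so the quoted name is cut in two.

```java
class NaiveSplitDemo {
    // Hypothetical helper: splits on every comma, with no awareness of quotes.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        String line = "\"Mr. John Doe, Jr.\",7926 Glenbrook Dr., 14623";
        String[] pieces = naiveSplit(line);
        // The line holds 3 fields, but the comma inside the quotes
        // produces 4 pieces.
        System.out.println(pieces.length + " pieces");
        for (String piece : pieces) {
            System.out.println("[" + piece + "]");
        }
    }
}
```

The first two pieces are the two halves of the quoted name, each still carrying one of the quote characters. That is the case the regex is meant to handle.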
<div>
C# Example:</div>
<div>
<pre>
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class CsvParser
{
    public string[] parseCSV(string line)
    {
        List<string> datalist = new List<string>();
        /*
         * Define a regular expression for CSV.
         * This pattern matches either quoted text (group 1) or text between
         * commas (group 2), including whitespace, and accounts for the
         * beginning and end of the line.
         */
        Regex rx = new Regex("\"([^\"]*)\"|(?<=,|^)([^,]*)(?:,|$)",
            RegexOptions.Compiled);
        // Find matches.
        MatchCollection matches = rx.Matches(line);
        // Report the number of matches found.
        Console.WriteLine("{0} matches found.", matches.Count);
        // Collect the value of each match.
        foreach (Match match in matches)
        {
            // Group 1 participates only when the field was quoted,
            // so an empty quoted field ("") is still handled correctly.
            if (match.Groups[1].Success)
                datalist.Add(match.Groups[1].Value); // value inside quotes
            else
                datalist.Add(match.Groups[2].Value); // unquoted value between commas
        }
        return datalist.ToArray();
    }
}</pre>
</div>
<div>
Java Example:</div>
<div>
<pre>
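import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CsvParser {
    // Matches either quoted text (group 1) or text between commas (group 2).
    private static final Pattern CSV_PATTERN =
            Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?:,|$)");

    public String[] parse(String csvLine) {
        List<String> allMatches = new ArrayList<>();
        Matcher matcher = CSV_PATTERN.matcher(csvLine);
        while (matcher.find()) {
            // group(1) is null when the field was not quoted.
            String match = matcher.group(1);
            if (match != null) {
                allMatches.add(match);
            } else {
                allMatches.add(matcher.group(2));
            }
        }
        // toArray returns an empty array when no matches were found.
        return allMatches.toArray(new String[0]);
    }
} </pre>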
</div>
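As a quick check, the following self-contained sketch runs the same pattern on the sample line from the beginning of the post. The class name `CsvRegexDemo` and its static `parse` helper are made up for this illustration. It is only meant to show what the pattern captures, not to be a complete CSV parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsvRegexDemo {
    // Same pattern as in the examples above: quoted text is captured in
    // group 1, unquoted text between commas in group 2.
    private static final Pattern CSV_PATTERN =
            Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?:,|$)");

    // Hypothetical helper: returns the fields of one CSV line.
    static List<String> parse(String csvLine) {
        List<String> fields = new ArrayList<>();
        Matcher matcher = CSV_PATTERN.matcher(csvLine);
        while (matcher.find()) {
            String quoted = matcher.group(1);
            fields.add(quoted != null ? quoted : matcher.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"Mr. John Doe, Jr.\",7926 Glenbrook Dr., 14623";
        // Brackets make leading whitespace visible in the output.
        for (String field : parse(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

The line yields three fields, and the quoted name keeps its inner comma. The pattern preserves whitespace, so the last field still has its leading space. Call `trim()` on each value if that is not wanted.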