HighTechTalks DotNet Forums  

StreamReader->regex->Win32 api marshalling

Dotnet Framework (Performance) microsoft.public.dotnet.framework.performance





  #11  
Tim
 

Re: StreamReader->regex->Win32 api marshalling - 09-02-2005, 04:54 AM






Well, as I say, the regex solution gives you some hope of being able to parse
the data, and I can't see any other way to do it. The other good thing about
using regex in messy cases like this is that any bits of text that don't get
recognised for whatever reason just don't get extracted, so they don't
interfere with the parsing of any subsequent 'valid' rows of data.
The data would look something like ...

1|1/9/2005|Bruce|Dunwiddie|in here, I can have commas and
\r\n anything except the bar char thats the designated field delimiter for
this particular table "\r\n
2|28/7/2005|Bob|Jones|some other text

.... so you would look for the pattern of a newline followed by an int followed
by the field delimiter as the row terminator (not forgetting the end of the
file, of course!).
I think the only solution is to turbocharge the regex class or give it some
steroids or something. Or maybe put some limited optional regex functionality
into your csv parser? If there was some way to submit a regex as
the row delimiter in your csv parser, that might be a good half-way solution?
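
To make the row-terminator idea concrete, here is a minimal sketch. It is in Python rather than .NET purely to keep it short, and `split_rows` is a made-up name, not part of any csv parser. It assumes, as in the sample above, that the free-text column can never contain the field delimiter, so a line break followed by digits and the delimiter can only be the start of a new row.

```python
import re


def split_rows(text, delimiter="|"):
    """Split raw text into rows, then split each row into fields.

    Hypothetical helper: a row boundary is a line break that is
    immediately followed by one or more digits and the delimiter.
    Line breaks inside the free-text column are left untouched.
    """
    # The lookahead keeps the leading integer as part of the next row.
    row_boundary = re.compile(
        r"(?:\r\n|\n|\r)(?=[0-9]+" + re.escape(delimiter) + ")"
    )
    rows = [row for row in row_boundary.split(text) if row.strip()]
    return [row.split(delimiter) for row in rows]
```

The end of the file needs no special case here, because whatever follows the last boundary is simply returned as the final row.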

"shriop" wrote:

Quote:
Can I see an example of one of these problematic rows? You're pretty
much out of luck from what you say. I saw a post a while back about
exporting events from the event viewer when the description contains
newline characters and the description is not quoted. I tried to find
something, anything, to base the logic off of. But if you can't be
certain of anything to deal with this file, then it's awful hard to
parse. This is one of the reasons why I try to push using my CsvWriter
class to create csv files, so people who don't know any better don't
try to come up with what they think a csv format is and end up leaving
out handling of newlines for example.



  #12  
shriop
 

Re: StreamReader->regex->Win32 api marshalling - 09-02-2005, 10:44 AM






Because of the way the parser is built for performance, there's no way
to build regular expressions into the parsing. It'd have to be a
totally different version of the parser.

What you're describing as possibly working with regular expressions doesn't
actually need regular expressions. You can, and I think should, do
the exact same thing with normal parsing. I'm fine with what
you're seeing as a pattern for a row delimiter. If you stayed with
using a normal csv parser, you could read in a row, and save the
results off somewhere. Then, read in the next row. If the first
column's value is not an integer, then you know that the first column's
value should be appended to the last column's value from the last row
and that you're still in the middle of a row. Rinse, and repeat.
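
A rough sketch of that loop, written in Python just for brevity. `merge_continuation_rows` is a hypothetical name and a plain `split` stands in for the real csv parser, so treat it as an illustration of the idea rather than how any particular parser does it. Two details are my own assumptions: the line break is put back as "\n" when the pieces are joined, and any extra fields on a continuation line are tacked onto the row being built.

```python
import re


def merge_continuation_rows(lines, delimiter="|"):
    """Rebuild logical rows from physical lines.

    A line whose first column is an integer starts a new row. Any other
    line is treated as the continuation of the previous row.
    """
    rows = []
    for line in lines:
        fields = line.split(delimiter)
        starts_new_row = re.fullmatch(r"[0-9]+", fields[0]) is not None
        if starts_new_row or not rows:
            rows.append(fields)
        else:
            # Still in the middle of a row: glue the first column onto the
            # last column of the previous row (assumed "\n" separator).
            rows[-1][-1] += "\n" + fields[0]
            # Assumption: further fields on this line belong to the same row.
            rows[-1].extend(fields[1:])
    return rows
```

One caveat: a continuation line that consists of nothing but digits would be mistaken for the start of a new row, so in practice it may be worth also checking the column count before accepting a line as a new row.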

