Well, as I say, the regex solution gives you some hope of being able to parse
the data, and I can't see any other way to do it. The other good thing about
using regex in messy cases like this is that any bits of text that don't get
recognised, for whatever reason, simply don't get extracted, so they don't
interfere with the parsing of any subsequent 'valid' rows of data.
The data would look something like ...
1|1/9/2005|Bruce|Dunwiddie|in here, I can have commas and
\r\n anything except the bar char thats the designated field delimiter for
this particular table "\r\n
2|28/7/2005|Bob|Jones|some other text
.... so you would look for the pattern of a newline followed by an int followed
by the field delimiter as the row terminator (not forgetting the end of the
file, of course!).
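To make that concrete, here is a rough sketch of the idea in Python (just for illustration, the same pattern should carry over to other regex engines). The sample string, the `ROW_BREAK` name, the `split_rows` helper and the field count of 5 are all my own assumptions based on the rows above, not anything from your parser.

```python
import re

# Hypothetical sample modelled on the rows above: the last field may hold
# commas, quotes and embedded newlines, but never the bar delimiter.
raw = (
    "1|1/9/2005|Bruce|Dunwiddie|in here, I can have commas and\r\n"
    " anything except the bar char \"\r\n"
    "2|28/7/2005|Bob|Jones|some other text"
)

# A row break is a newline that is immediately followed by an integer and
# the field delimiter. The lookahead leaves the next row's id in place.
ROW_BREAK = re.compile(r"\r?\n(?=\d+\|)")


def split_rows(text, field_count=5):
    """Split text into rows, then each row into at most field_count fields."""
    rows = []
    for chunk in ROW_BREAK.split(text):
        chunk = chunk.rstrip("\r\n")  # drop a trailing newline at end of file
        if not chunk:
            continue
        # maxsplit keeps everything after the last expected bar in one field
        rows.append(chunk.split("|", field_count - 1))
    return rows


rows = split_rows(raw)
```

The end of the file needs no special case here because the last chunk is simply whatever is left after the final row break. This only holds as long as the free-text field really never contains the bar character.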
I think the only solution is to turbocharge the regex class or give it some
steroids or something. Or maybe put some limited optional regex functionality
into your csv parser? If there was some way to submit a regex as the row
delimiter in your csv parser, that might be a good half-way solution?
"shriop" wrote:
Quote:
Can I see an example of one of these problematic rows? You're pretty
much out of luck from what you say. I saw a post a while back about
exporting events from the event viewer when the description contains
newline characters and the description is not quoted. I tried to find
something, anything, to base the logic off of. But if you can't be
certain of anything to deal with this file, then it's awful hard to
parse. This is one of the reasons why I try to push using my CsvWriter
class to create csv files, so people who don't know any better don't
try to come up with what they think a csv format is and end up leaving
out handling of newlines for example.