Re: Remove extra Double codes from CSV Files (vb Script Language)


Christoph Basedau

Learn4Develop wrote:

> On a daily basis I receive comma-separated files with double-quoted values, but
> in some records I find doubled double quotes, and sometimes the double quotes
> are placed in a different sequence.
> I just want to remove all unnecessary (extra) double quotes (see the
> following values, where I want to remove the extra double quotes):
> "0123x",""Company D-Val"","Class D, sector N","DD5894"
> "4894D",""Recycle" Rubbish, C class","Class D, Sector F, Block N","D870GH"
> "AB8679",""AB Ltd"",""Need" Clean Drive Way, stores","GF0347"
> I am looking for output like this:
> "0123x","Company D-Val","Class D, sector N","DD5894"
> "4894D","Recycle Rubbish, C class","Class D, Sector F, Block N","D870GH"
> "AB8679","AB Ltd","Need Clean Drive Way, stores","GF0347"
If you want to implement the appropriate algorithm, you cannot
use a replace mechanism based on literals; you always have to be
aware of the quoting and delimiting context.
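To illustrate why a literal replace is not enough, here is a minimal sketch that collapses every doubled DQ with VBScript's built-in Replace and applies it to the second sample record from the question. The variable names (record, naive) are only for illustration.

```vbscript
Option Explicit

Const DQ = """"

Dim record, naive

' Second sample record from the question
record = """4894D"",""""Recycle"" Rubbish, C class"",""Class D, Sector F, Block N"",""D870GH"""

' Naive approach: collapse every pair of double quotes into a single one
naive = Replace(record, DQ & DQ, DQ)

WScript.Echo record
WScript.Echo naive
' The quote after Recycle is a single DQ, so Replace never touches it
' and the second field still contains a stray quote.
```

Run with cscript, the second line printed is "4894D","Recycle" Rubbish, C class","Class D, Sector F, Block N","D870GH". The first sample record happens to come out right because all of its extra quotes are doubled, but here the unpaired quote after Recycle survives, which is why the context has to be tracked.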
The right way, in my opinion, is to parse the string char by char, count the
DQs (double quotes), and check whether a DQ is followed by a delimiter.
If the delimiter appears after an even number of DQs, it truly terminates
the token; if the number is odd, it's an "inline delimiter".
A DQ, on the other hand, is kept if it's the first one (obviously,
as the opener) or one with an even ordinal number followed by the delimiter
(or by the end of the record).


In VBS a function that follows these rules would look like this:

Function unquote(record)

  Const DQ = """"
  Const CM = ","

  Dim newRecord
  Dim dqCount
  Dim char
  Dim nextChar
  Dim i
  Dim keep

  newRecord = ""
  dqCount = 0

  For i = 1 To Len(record)

    char = Mid(record, i, 1)
    'Mid returns "" when i + 1 is past the end of the record
    nextChar = Mid(record, i + 1, 1)
    keep = 0

    If char = DQ Then
      If dqCount = 0 Then
        'beginning of token
        keep = 1
        dqCount = 1

      ElseIf (dqCount Mod 2 = 1) And _
             (nextChar = CM Or nextChar = "") Then

        'end of token marked by ", or by " at the end of the record
        keep = 1
        dqCount = 0
      Else
        'inline "
        keep = 0
        dqCount = dqCount + 1
      End If
    Else
      'char other than "
      keep = 1
    End If

    If keep = 1 Then
      newRecord = newRecord & char
    End If
    'WSH.Echo dqCount, i, keep, char, nextchar, newRecord
  Next

  unquote = newRecord
End Function

To test the results, run:

Option Explicit
Dim records
records = Array ( _
  """0123x"",""""Company D-Val"""",""Class D, sector N"",""DD5894""" _
  , """4894D"",""""Recycle"" Rubbish, C class"",""Class D, Sector F, Block N"",""D870GH""" _
  , """AB8679"",""""AB Ltd"""",""""Need"" Clean Drive Way, stores"",""GF0347""" _
  , """""AB8679"", Test New"",""""""AB Ltd"""""",""""Need"" Clean Drive Way, stores"",""GF0347""" _
)

Dim Record

For Each Record In records
  WSH.Echo record
  WSH.Echo unquote(record)
Next

Function unquote(record)
  '... body of the function as given above ...
End Function
