Excel is fundamentally broken.

gordonjcp

I have an auto-generated CSV file from a bit of monitoring equipment. $office_folk want it as an Excel file, because they can only think in Excel.

One column contains a value consisting of the shelf and port number of a connection, which looks like "3-2" or "5-44". There appears to be no way to tell Excel to treat that as a text stringm, and not try to format it as a date.
Letting Excel load it and turning it back to a text column just turns it into a random number.

But that's okay! We have paid a shitload of money for Microsoft support! What do they have to say? Paraphrasing slightly "lol it looks like a date zomg don't make it look like a date".

What a bag of shite.

Jaloopa

Open Excel
Data tab
Choose external data from text
Choose the file
Set the delimiters
On the last step of the wizard, select every column (or every column with this format and change it from General to Text
???
Profit

0_1527253040225_fac830ce-06c0-4cde-b69c-ab5ad6180cc4-image.png

Khudzlin

@gordonjcp Auto-formatting in Excel is really annoying. Try putting a single quote at the start of each value to tell Excel to leave it the fuck alone (that single quote won't be displayed).

boomzilla

@khudzlin said in Excel is fundamentally broken.:

Try putting a single quote at the start of each value to tell Excel to leave it the fuck alone (that single quote won't be displayed).

Yes. This is the magic "treat this as a string" Excel incantation.

gordonjcp

@boomzilla Tried that, now I have a date format with a single quote in front.

gordonjcp

@khudzlin Tried that, and various permutations around that, and I mostly get a date format with assorted punctuation around it. Sometimes I get a five-digit number.

anonymous234

Don't call Excel that. I believe the preferred term is "special needs"

blakeyrat

@gordonjcp Just do a manual import and define your column types instead of having Excel guess.

HoviWan

Create a false .csv file with the same number of columns. put text data in the first row, and save.
With a blank worksheet in Excel, us the Data-Get & Transform date-From Text/CSV function to import the false .csv file. It will import as text.
Save as a worksheet & close.
Save the real .csv file over the false .csv file
Re-open the worksheet, click on a data field, select Refresh in the Table Tools Ribbon.
Automate, as desired.

Ben_Warre

GitHub - ulich/csv2xls: A python script that converts a csv file to Excel format (.xlsx)

A python script that converts a csv file to Excel format (.xlsx) - ulich/csv2xls

Sure there are other ways out there.

ObjectMike

I thought putting values in double quotes was the CSV way of saying "string, us as-is". Does Excel try to interpret these kinds of values.

heterodox

@mikehurley said in Excel is fundamentally broken.:

I thought putting values in double quotes was the CSV way of saying "string, us as-is". Does Excel try to interpret these kinds of values.

It sure does. -_-

JBert

@mikehurley said in Excel is fundamentally broken.:

I thought putting values in double quotes was the CSV way of saying "string, us as-is". Does Excel try to interpret these kinds of values.

CSV doesn't care about types, those quotes merely indicate that the following characters might contain newlines or commas.

Tsaukpaetra

@jbert said in Excel is fundamentally broken.:

@mikehurley said in Excel is fundamentally broken.:

I thought putting values in double quotes was the CSV way of saying "string, us as-is". Does Excel try to interpret these kinds of values.

CSV doesn't care about types, those quotes merely indicate that the following characters might contain newlines or commas.

And double-double-quotes indicate a single double-quote, and a double-double-double-quotes indicates two double-quotes...

Edit, wait, no, I think that would be three double-quotes? Maths is hard.

Scarlet_Manuka

Darn it, now you're making me log into my work account so I can see what we use to get around this.

(five minutes later)

Looks like what you want is this:

="5-44"

Starting with = makes it into a formula, a simple one which just returns a constant string.

gordonjcp

@blakeyrat Sounds good, how do you do that?

gordonjcp

@scarlet_manuka That still gets parsed into a date.

hungrier

@tsaukpaetra said in Excel is fundamentally broken.:

double-double-quotes

That's when you're quoting someone's Tim Hortons order.

Scarlet_Manuka

@gordonjcp said in Excel is fundamentally broken.:

@scarlet_manuka That still gets parsed into a date.

Dunno what to tell you. It works in our CSVs.

TimeBandit

@gordonjcp said in Excel is fundamentally broken.:

how do you do that?

Install LibreOffice Calc
Open your CSV with LO Calc
You'll get the text import screen
Click on the column header and select "Text" for the column type
Click OK
Click File -> Save As...
Select File type as "Microsoft Excel 2007-2013 XML (.xlsx) then click Save

Enjoy your new Excel file with proper column types

Jaloopa

@timebandit said in Excel is fundamentally broken.:

@gordonjcp said in Excel is fundamentally broken.:

how do you do that?

Install LibreOffice Calc

Open your CSV with LO Calc

You'll get the text import screen

Click on the column header and select "Text" for the column type

Click OK

Click File -> Save As...

Select File type as "Microsoft Excel 2007-2013 XML (.xlsx) then click Save

Enjoy your new Excel file with proper column types

Or do the same thing with excel like I said in the first reply

TimeBandit

@jaloopa said in Excel is fundamentally broken.:

Or do the same thing with excel like I said in the first reply

Your solution is better, mine doesn't have a "profit" step

gordonjcp

@timebandit Okay, so if I could install LibreOffice then I would, but for various reasons that's not a viable option.

Kurt.C.Pause

@gordonjcp

Ah yes, the old helpful excel autoformat! now with data loss!

repost from myself from the other thread:

If Microsoft wanted to Improve Excel (and I should know, I have designed 0 spreadsheet applications) they should split each worksheet in three layers:

Data
Formatting
Formulas / Code

The amount of Data loss due to excel helpfully stripping leading 0, or saving long numbers in Scientific Notations (E+29) I have witnessed (in some shitty one time ETL processes for SAP) is staggering.

what's more:
I still want to find the person responsible in removing english Function Names as a possibility in translated Excel versions; and let him meet my clue-by-four.

Vlookup => SVerweis
Len => Länge =>

This gets converted to english (I assume) anyways, cause the Formulas saved in EN-Excel work in DE-Excel and vice-versa.

topspin

@kurt-c-pause said in Excel is fundamentally broken.:

I still want to find the person responsible in removing english Function Names as a possibility in translated Excel versions; and let him meet my clue-by-four.
Vlookup => SVerweis
Len => Länge =>

That, and fucking up imports by interpreting files according to the user's current display locale.
Your CSV file contains the line 3.1415 ; 2.7183? Surely, with your German locale displaying decimal separators as commas, you want to import this as 31415, 27183. Because file contents should change according to display settings.

Gąska

@kurt-c-pause said in Excel is fundamentally broken.:

I still want to find the person responsible in removing english Function Names as a possibility in translated Excel versions; and let him meet my clue-by-four.

At least they provide a translation tool.

Onyx

I am of a firm opinion that every piece of software needs a Stop helping! checkbox.

pie_flavor

@onyx except for Oracle software, which needs a Start helping checkbox.

Onyx

@pie_flavor said in Excel is fundamentally broken.:

@onyx except for Oracle software, which needs a Start helping checkbox.

It already changes empty strings to NULLs, I don't know what more you could ask for!

Ben_Warre

@scarlet_manuka @gordonjcp and on my test here. Excel 2016.

Carnage

@onyx said in Excel is fundamentally broken.:

@pie_flavor said in Excel is fundamentally broken.:

@onyx except for Oracle software, which needs a Start helping checkbox.

It already changes empty strings to NULLs, I don't know what more you could ask for!

Speaking of this null in cohorts with strings, the product I'm currently getting away from has a nifty duoble take on null strings bug.
When you update a thingymabob you enter a text into a "reason or whatever" field, that is concatenated into a "history" text.
When you do not enter anything, that empty string is somewhere along the line interpreted as null, and then again, between this and when it gets concatenated into the text string, a null string gets interpreted as the string "null". So there are a bunch of "nullnullnnull" in the history text.
Oracle is involved, but I don't think that oracle is where stuff goes herpaderp.

dkf

@carnage said in Excel is fundamentally broken.:

Oracle is involved, but I don't think that oracle is where stuff goes herpaderp.

What an unexpected twist!

Carnage

@dkf said in Excel is fundamentally broken.:

@carnage said in Excel is fundamentally broken.:

Oracle is involved, but I don't think that oracle is where stuff goes herpaderp.

What an unexpected twist!

In a system that uses Excel as a data transfer protocol between services internally, to save dev time, Oracle is actually pretty sane.

Ben_Warre

From the bottom of this SO question:

Stop Excel from automatically converting certain text values to dates

Does anyone happen to know if there is a token I can add to my csv for a certain field so Excel doesn't try to convert it to a date? I'm trying to write a .csv file from my application and one of ...

Add a tab in front of your data. Excel interprets it as text, but doesn't display anything.

Of course, since your monitoring data is creating the file, the point is moot, as you're always going to need to run the file through a filter before distributing it. Unless the monitoring kit has some options in it.

anotherusername

The .xlsx format is not excessively complex... it may be possible to make a basic conversion tool...

Cabbage

See also http://georgemauer.net/2017/10/07/csv-injection.html. If you're using CSVs with Excel, it's worth knowing that opening a CSV containing untrusted data from user input is a remote code execution vulnerability, because opening a CSV containing a cell value like "=2+5+cmd|' /C calc'!A0" will run a command line command (in this case, opening Calculator). This vulnerability is completely unavoidable regardless of how you format your CSV.

"That's bullshit," says an Excel apologist, "you can put a ' at the start of every field and then Excel will treat the cell as a literal string and not attempt to parse it in any clever way." Sure, that works. Unless the user who opened in Excel hits Save and then opens the file again. Because, y'see, Excel removes those leading quotes on save, thus changing the meaning of the underlying data and reintroducing the code execution vulnerability. Do you 100% trust your backoffice staff that you're giving this CSV file to to understand this and never, ever open, edit, and re-save a CSV within Excel? If not, then you can't safely give them a CSV containing untrusted user-provided data, ever.

Yes, all this is completely fucking retarded.

dkf

@cabbage said in Excel is fundamentally broken.:

Do you 100% trust your backoffice staff

Hohohoho!

No.

Cabbage

@dkf said in Excel is fundamentally broken.:

@cabbage said in Excel is fundamentally broken.:

Do you 100% trust your backoffice staff

Hohohoho!

No.

Yeah, Santa, if I were you I wouldn't trust those fucking elves, either. I'm just racist like that.

pie_flavor

@cabbage said in Excel is fundamentally broken.:

@dkf said in Excel is fundamentally broken.:

@cabbage said in Excel is fundamentally broken.:

Do you 100% trust your backoffice staff

Hohohoho!

No.

Yeah, Santa, if I were you I wouldn't trust those fucking elves, either. I'm just racist like that.

Bright (2017) ⭐ 6.3 | Action, Fantasy, Thriller

1h 57m | TV-MA

topspin

@cabbage Holy fucking fuck!!

dfdub

@cabbage Holy shit, TIL. Does it at least not do that in protected mode?

anotherusername

I am probably certifiably insane now, but I made a thing.

csvtoxlsx.js:

// Drag-and-drop one or several CSV files and/or folders onto this .js file; non-CSV files will be ignored
// If a folder is dropped, the script will automatically process all CSV files in the folder (but not subfolders)
// Each CSV file will be converted to an XLSX file, without any type conversion (all values treated as Text)
// If an XLSX file already exists with the same name as a CSV file, the CSV file will be skipped

// These default values are what you'll typically need for parsing Windows-based CSV files...
// (note: the CSV parser should handle multi-character delimiters just fine)
var CSV_ext = 'csv';              // CSV file extension (case-insensitive)
var CSV_ln = '\r\n';              // row delimiter (newline is '\r\n' - other line endings will be converted)
var CSV_delim = ',';              // column delimiter (may need to be ';' in some locales)
var CSV_quote = '"';              // quote delimiter
var CSV_escaped_quote = '""';     // this sequence unescapes back to an embedded quote delimiter
//var CSV_escaped_quote = '\\"';    // you'd use this for \" escaped quotes

// The XLSX container itself is a standard zip file which contains multiple files that make up the workbook
// You can set a zip archive comment if you want... Excel will ignore it, but a zip viewer would show it
var ZipComment = '';


// Ok, here goes! You probably shouldn't change anything below this...
var FSO = new ActiveXObject('Scripting.FileSystemObject');

function ParseCSV(csvString) {
  var csvData = [];

  // add a blank row at the top so that if the first cell is quoted, it'll be parsed properly
  csvString = CSV_ln + csvString;

  for (var i = 0, row = 0, col = 0; i < csvString.length; ++ i) {
    if (!csvData[row]) csvData[row] = [];
    if (!csvData[row][col]) csvData[row][col] = '';

    var is_ln = csvString.substr(i, CSV_ln.length) == CSV_ln;
    var is_delim = csvString.substr(i, CSV_delim.length) == CSV_delim;
    if (is_ln || is_delim) {
      if (is_ln) {
        i += CSV_ln.length - 1;
        ++ row;
        col = 0;
      } else {
        i += CSV_delim.length - 1;
        ++ col;
      }

      if (csvString.substr(i + 1, CSV_quote.length) == CSV_quote) {
        // attempt to parse what follows as a string
        // string terminates with the first unquoted quote (or the end of input)
        // and may only be followed by a row/column delimiter (or the end of input)
        for (var s = '', j = i + CSV_quote.length + 1; j < csvString.length; ++ j) {
          if (csvString.substr(j, CSV_escaped_quote.length) == CSV_escaped_quote) {
            j += CSV_escaped_quote.length - 1;
            s += CSV_quote;
          } else if (j + 1 >= csvString.length || csvString.substr(j, CSV_quote.length) == CSV_quote) {
            j += CSV_quote.length - 1;
            is_ln = csvString.substr(j + 1, CSV_ln.length) == CSV_ln;
            is_delim = csvString.substr(j + 1, CSV_delim.length) == CSV_delim;
            if (j + 2 >= csvString.length || is_ln || is_delim) {
              // string terminates properly, so store its value and continue parsing from its end
              if (!csvData[row]) csvData[row] = [];
              csvData[row][col] = s;
              i = j;
            } else {
              // unable to parse string properly, so re-parse as plain text
              i = i;
            }
            j = csvString.length;
          } else {
            s += csvString.charAt(j);
          }
        }
      }

    } else {
      csvData[row][col] += csvString.charAt(i);
    }
  }

  // remove the blank row that was added
  csvData.shift();

  return csvData;
}

// DOS has a really fun bit-packed datetime struct...
function DosDate(date) {
  date = new Date(String(date));
  var t = Math.floor(date.getSeconds() / 2) | date.getMinutes() << 5 | date.getHours() << 11;
  var d = date.getDate() | (date.getMonth() + 1) << 5 | (date.getFullYear() - 1980) << 9;
  return toInt32(d << 16 | t);
}

// CRC-32 for Zip -- thanks to http://mdfs.net/Info/Comp/Comms/CRC32.htm & Wikipedia
var CRC_poly = 0xedb88320, CRC_table = [];
for (var i = 0; i < 256; ++ i) {
  for (var crc = i, b = 0; b < 8; ++ b)
    crc = crc & 1 ? crc >>> 1 ^ CRC_poly : crc >>> 1;
  CRC_table[i] = crc;
}

function CRC32(data) {
  for (var crc = -1, i = 0; i < data.length; ++ i)
    crc = CRC_table[crc & 0xff ^ data.charCodeAt(i)] ^ crc >>> 8;
  return toInt32(crc ^ -1);
}

function toInt32(n) {
  return String.fromCharCode(n & 0xff) + String.fromCharCode(n >> 8 & 0xff)
    + String.fromCharCode(n >> 16 & 0xff) + String.fromCharCode(n >> 24 & 0xff);
}

function MakeZipFile(localName, file) {
  var isFolder = typeof file == 'undefined', lastModified = new Date;
  localName = localName.replace(/\\/g, '/').replace(/^\/+|\/+(?=\/)|\/+$/g, '');

  var size, namesize;
  if (isFolder) {
    localName += '/';
  } else {
    // convert from UTF-8 to corresponding 8-bit character array
    file = unescape(encodeURIComponent(file));
    
    size = toInt32(file.length);
  }

  namesize = toInt32(localName.length).slice(0, 2);
  var data = 'PK\x03\x04'                               // 0-3: Signature
           + (isFolder ? '\x14\0' : '\x0a\0')           // 4-5: Version needed
           + '\0\0'                                     // 6-7: Flags
           + '\0\0'                                     // 8-9: Compression method
           + DosDate(lastModified)                      // 10-13: File modification time & date
           + (isFolder ? '\0\0\0\0' : CRC32(file))      // 14-17: CRC-32
           + (isFolder ? '\0\0\0\0' : size)             // 18-21: Compressed size
           + (isFolder ? '\0\0\0\0' : size)             // 22-25: Uncompressed size
           + namesize                                   // 26-27: Filename length
           + '\0\0'                                     // 28-29: Extra field length
           + localName                                  // 30: Filename
           + ''                                         // Extra field
           + (isFolder ? '' : file);                    // File contents
  return data;
}

function MakeZipIndex(offset, zipFile) {
  var isFolder = zipFile.charAt(4) == 0x14;
  var data = 'PK\x01\x02'                               // Signature
           + '\x3f\0'                                   // Version made by
           + zipFile.substr(4, 24)                      // Version needed ... Filename length
           + '\0\0'                                     // Extra data length
           + '\0\0'                                     // File comment length
           + '\0\0'                                     // Disk # Start
           + '\0\0'                                     // Internal attributes
           + (isFolder ? '\x10' : '\x20') + '\x20\0\0'  // External attributes
           + toInt32(offset)
           + zipFile.substr(30, zipFile.charCodeAt(27) << 8 | zipFile.charCodeAt(26)) // Filename
           + ''                                         // Extra data
           + '';                                        // File comment
  return data;
}

function MakeZipIndexEnd(offset, files, zipIndex) {
  files = String.fromCharCode(files & 0xff) + String.fromCharCode(files >> 8 & 0xff);
  var commentLen = ZipComment.length, cdSize = zipIndex.length;
  var data = 'PK\x05\x06'                               // Signature
           + '\0\0'                                     // Disk number
           + '\0\0'                                     // Disk # w/start of central directory header
           + files                                      // Disk entries
           + files                                      // Total entries
           + toInt32(cdSize)                            // Central directory size
           + toInt32(offset)                            // Start offset of central directory
           + toInt32(commentLen).slice(0, 2)            // Comment length
           + ZipComment;                                // Zip file comment
  return data;
}

// This is necessary because of character set shenanigans. Ain't character set shenanigans fun?
var translate = {0x80: '\xc7', 0x81: '\xfc', 0x82: '\xe9', 0x83: '\xe2', 0x84: '\xe4', 0x85: '\xe0', 0x86: '\xe5', 0x87: '\xe7', 0x88: '\xea', 0x89: '\xeb', 0x90: '\xc9', 0x91: '\xe6', 0x92: '\xc6', 0x93: '\xf4', 0x94: '\xf6', 0x95: '\xf2', 0x96: '\xfb', 0x97: '\xf9', 0x98: '\xff', 0x99: '\xd6', 0x8a: '\xe8', 0x8b: '\xef', 0x8c: '\xee', 0x8d: '\xec', 0x8e: '\xc4', 0x8f: '\xc5', 0x9a: '\xdc', 0x9b: '\xa2', 0x9c: '\xa3', 0x9d: '\xa5', 0x9e: '\u20a7', 0x9f: '\u0192', 0xa0: '\xe1', 0xa1: '\xed', 0xa2: '\xf3', 0xa3: '\xfa', 0xa4: '\xf1', 0xa5: '\xd1', 0xa6: '\xaa', 0xa7: '\xba', 0xa8: '\xbf', 0xa9: '\u2310', 0xaa: '\xac', 0xab: '\xbd', 0xac: '\xbc', 0xad: '\xa1', 0xae: '\xab', 0xaf: '\xbb', 0xb0: '\u2591', 0xb1: '\u2592', 0xb2: '\u2593', 0xb3: '\u2502', 0xb4: '\u2524', 0xb5: '\u2561', 0xb6: '\u2562', 0xb7: '\u2556', 0xb8: '\u2555', 0xb9: '\u2563', 0xba: '\u2551', 0xbb: '\u2557', 0xbc: '\u255d', 0xbd: '\u255c', 0xbe: '\u255b', 0xbf: '\u2510', 0xc0: '\u2514', 0xc1: '\u2534', 0xc2: '\u252c', 0xc3: '\u251c', 0xc4: '\u2500', 0xc5: '\u253c', 0xc6: '\u255e', 0xc7: '\u255f', 0xc8: '\u255a', 0xc9: '\u2554', 0xca: '\u2569', 0xcb: '\u2566', 0xcc: '\u2560', 0xcd: '\u2550', 0xce: '\u256c', 0xcf: '\u2567', 0xd0: '\u2568', 0xd1: '\u2564', 0xd2: '\u2565', 0xd3: '\u2559', 0xd4: '\u2558', 0xd5: '\u2552', 0xd6: '\u2553', 0xd7: '\u256b', 0xd8: '\u256a', 0xd9: '\u2518', 0xda: '\u250c', 0xdb: '\u2588', 0xdc: '\u2584', 0xdd: '\u258c', 0xde: '\u2590', 0xdf: '\u2580', 0xe0: '\u03b1', 0xe1: '\xdf', 0xe2: '\u0393', 0xe3: '\u03c0', 0xe4: '\u03a3', 0xe5: '\u03c3', 0xe6: '\xb5', 0xe7: '\u03c4', 0xe8: '\u03a6', 0xe9: '\u0398', 0xea: '\u03a9', 0xeb: '\u03b4', 0xec: '\u221e', 0xed: '\u03c6', 0xee: '\u03b5', 0xef: '\u2229', 0xf0: '\u2261', 0xf1: '\xb1', 0xf2: '\u2265', 0xf3: '\u2264', 0xf4: '\u2320', 0xf5: '\u2321', 0xf6: '\xf7', 0xf7: '\u2248', 0xf8: '\xb0', 0xf9: '\u2219', 0xfa: '\xb7', 0xfb: '\u221a', 0xfc: '\u207f', 0xfd: '\xb2', 0xfe: '\u25a0', 0xff: '\xa0'};
var reverse = {'\u20ac': 0x80, '\u201a': 0x82, '\u0192': 0x83, '\u201e': 0x84, '\u2026': 0x85, '\u2020': 0x86, '\u2021': 0x87, '\u02c6': 0x88, '\u2030': 0x89, '\u0160': 0x8a, '\u2039': 0x8b, '\u0152': 0x8c, '\u017d': 0x8e, '\u2018': 0x91, '\u2019': 0x92, '\u201c': 0x93, '\u201d': 0x94, '\u2022': 0x95, '\u2013': 0x96, '\u2014': 0x97, '\u02dc': 0x98, '\u2122': 0x99, '\u0161': 0x9a, '\u203a': 0x9b, '\u0153': 0x9c, '\u017e': 0x9e, '\u0178': 0x9f};
function ReadBinaryFile(filename) {
  var file = FSO.OpenTextFile(filename), data = '', textData = '';
  while (!file.AtEndOfStream) {
    var c = file.Read(1);
    data += reverse[c] ? String.fromCharCode(reverse[c]) : c;
    textData += c;
  }
  return [data, textData];
}
function WriteBinaryFile(filename, binaryData) {
  // It's possible that FSO.CreateTextFile using the TextStream.Write method would work...
  // That might also depend on the regional settings in Windows, so I'm kinda afraid to try it
  var stream = new ActiveXObject('ADODB.Stream');
  stream.Type = 2;
  stream.Charset = "437";
  stream.Open();
  for (var i = 0; i < binaryData.length; ++ i) {
    var c = binaryData.charCodeAt(i);
    stream.WriteText(translate[c] || String.fromCharCode(c));
  }
  stream.SaveToFile(filename);
  stream.Close();
}

function XMLEscape(string) {
  var XMLChars = {'"': '&quot;', '\'': '&apos;', '<': '&lt;', '>': '&gt;', '&': '&amp;'};
  return string.replace(/["'<>&]/g, function (c) {
    return XMLChars[c];
  });
}

// Actual start of the code
var success = false;
for (var i = 0; i < WScript.arguments.length; ++ i) {
  var file = WScript.Arguments(i), ext = file.split('.').pop().toLowerCase();

  var files = [];
  if (FSO.FolderExists(file)) {
    var folder = new Enumerator(FSO.GetFolder(file).files);
    for (; !folder.atEnd(); folder.moveNext()) {
      file = folder.item();
      if (String(file).split('.').pop().toLowerCase() == CSV_ext.toLowerCase()) {
        files.push(String(file));
      }
    }
  } else {
    files.push(file);
  }

  // these files will be identical for every workbook, so they're defined outside the loop
  // for speed and efficiency, files are stored and zipped in memory, rather than on disk

  // .\[Content_Types].xml
  var ContentTypes = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"><Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/><Default Extension="xml" ContentType="application/xml"/><Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/><Override PartName="/xl/worksheets/sheet1.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"/><Override PartName="/xl/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml"/><Override PartName="/xl/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.styles+xml"/><Override PartName="/xl/sharedStrings.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sharedStrings+xml"/></Types>';

  // .\_rels\.rels
  var rels = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="xl/workbook.xml"/></Relationships>';

  // .\xl\styles.xml
  var styles = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"><fonts count="1"><font></font></fonts><fills count="1"><fill></fill></fills><borders count="1"><border></border></borders><cellStyleXfs count="1"><xf numFmtId="0" fontId="0" fillId="0" borderId="0"/></cellStyleXfs><cellXfs count="2"><xf numFmtId="49" fontId="0" fillId="0" borderId="0" xfId="0" quotePrefix="1"/><xf numFmtId="49" fontId="0" fillId="0" borderId="0" xfId="0" quotePrefix="1" applyNumberFormat="1" applyAlignment="1"><alignment wrapText="1"/></xf></cellXfs><cellStyles count="1"><cellStyle name="Normal" xfId="0" builtinId="0"/></cellStyles><dxfs count="0"/><tableStyles count="0" defaultTableStyle="TableStyleMedium2" defaultPivotStyle="PivotStyleMedium9"/></styleSheet>';

  // .\xl\_rels\workbook.xml.rels
  var workbook_rels = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet1.xml"/><Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings" Target="sharedStrings.xml"/><Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/><Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml"/></Relationships>';

  // .\xl\theme\theme1.xml
  var theme = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><a:theme xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" name="Office Theme"><a:themeElements><a:clrScheme name="Office"><a:dk1><a:sysClr val="windowText" lastClr="000000"/></a:dk1><a:lt1><a:sysClr val="window" lastClr="FFFFFF"/></a:lt1><a:dk2><a:srgbClr val="1F497D"/></a:dk2><a:lt2><a:srgbClr val="EEECE1"/></a:lt2><a:accent1><a:srgbClr val="4F81BD"/></a:accent1><a:accent2><a:srgbClr val="C0504D"/></a:accent2><a:accent3><a:srgbClr val="9BBB59"/></a:accent3><a:accent4><a:srgbClr val="8064A2"/></a:accent4><a:accent5><a:srgbClr val="4BACC6"/></a:accent5><a:accent6><a:srgbClr val="F79646"/></a:accent6><a:hlink><a:srgbClr val="0000FF"/></a:hlink><a:folHlink><a:srgbClr val="800080"/></a:folHlink></a:clrScheme><a:fontScheme name="Office"><a:majorFont><a:latin typeface="Cambria"/><a:ea typeface=""/><a:cs typeface=""/></a:majorFont><a:minorFont><a:latin typeface="Calibri"/><a:ea typeface=""/><a:cs typeface=""/></a:minorFont></a:fontScheme><a:fmtScheme name="Office"><a:fillStyleLst><a:solidFill><a:schemeClr val="phClr"/></a:solidFill><a:gradFill rotWithShape="1"><a:gsLst><a:gs pos="0"><a:schemeClr val="phClr"><a:tint val="50000"/><a:satMod val="300000"/></a:schemeClr></a:gs><a:gs pos="35000"><a:schemeClr val="phClr"><a:tint val="37000"/><a:satMod val="300000"/></a:schemeClr></a:gs><a:gs pos="100000"><a:schemeClr val="phClr"><a:tint val="15000"/><a:satMod val="350000"/></a:schemeClr></a:gs></a:gsLst><a:lin ang="16200000" scaled="1"/></a:gradFill><a:gradFill rotWithShape="1"><a:gsLst><a:gs pos="0"><a:schemeClr val="phClr"><a:shade val="51000"/><a:satMod val="130000"/></a:schemeClr></a:gs><a:gs pos="80000"><a:schemeClr val="phClr"><a:shade val="93000"/><a:satMod val="130000"/></a:schemeClr></a:gs><a:gs pos="100000"><a:schemeClr val="phClr"><a:shade val="94000"/><a:satMod val="135000"/></a:schemeClr></a:gs></a:gsLst><a:lin ang="16200000" scaled="0"/></a:gradFill></a:fillStyleLst><a:lnStyleLst><a:ln w="9525" cap="flat" cmpd="sng" algn="ctr"><a:solidFill><a:schemeClr val="phClr"><a:shade val="95000"/><a:satMod val="105000"/></a:schemeClr></a:solidFill><a:prstDash val="solid"/></a:ln><a:ln w="25400" cap="flat" cmpd="sng" algn="ctr"><a:solidFill><a:schemeClr val="phClr"/></a:solidFill><a:prstDash val="solid"/></a:ln><a:ln w="38100" cap="flat" cmpd="sng" algn="ctr"><a:solidFill><a:schemeClr val="phClr"/></a:solidFill><a:prstDash val="solid"/></a:ln></a:lnStyleLst><a:effectStyleLst><a:effectStyle><a:effectLst><a:outerShdw blurRad="40000" dist="20000" dir="5400000" rotWithShape="0"><a:srgbClr val="000000"><a:alpha val="38000"/></a:srgbClr></a:outerShdw></a:effectLst></a:effectStyle><a:effectStyle><a:effectLst><a:outerShdw blurRad="40000" dist="23000" dir="5400000" rotWithShape="0"><a:srgbClr val="000000"><a:alpha val="35000"/></a:srgbClr></a:outerShdw></a:effectLst></a:effectStyle><a:effectStyle><a:effectLst><a:outerShdw blurRad="40000" dist="23000" dir="5400000" rotWithShape="0"><a:srgbClr val="000000"><a:alpha val="35000"/></a:srgbClr></a:outerShdw></a:effectLst><a:scene3d><a:camera prst="orthographicFront"><a:rot lat="0" lon="0" rev="0"/></a:camera><a:lightRig rig="threePt" dir="t"><a:rot lat="0" lon="0" rev="1200000"/></a:lightRig></a:scene3d><a:sp3d><a:bevelT w="63500" h="25400"/></a:sp3d></a:effectStyle></a:effectStyleLst><a:bgFillStyleLst><a:solidFill><a:schemeClr val="phClr"/></a:solidFill><a:gradFill rotWithShape="1"><a:gsLst><a:gs pos="0"><a:schemeClr val="phClr"><a:tint val="40000"/><a:satMod val="350000"/></a:schemeClr></a:gs><a:gs pos="40000"><a:schemeClr val="phClr"><a:tint val="45000"/><a:shade val="99000"/><a:satMod val="350000"/></a:schemeClr></a:gs><a:gs pos="100000"><a:schemeClr val="phClr"><a:shade val="20000"/><a:satMod val="255000"/></a:schemeClr></a:gs></a:gsLst><a:path path="circle"><a:fillToRect l="50000" t="-80000" r="50000" b="180000"/></a:path></a:gradFill><a:gradFill rotWithShape="1"><a:gsLst><a:gs pos="0"><a:schemeClr val="phClr"><a:tint val="80000"/><a:satMod val="300000"/></a:schemeClr></a:gs><a:gs pos="100000"><a:schemeClr val="phClr"><a:shade val="30000"/><a:satMod val="200000"/></a:schemeClr></a:gs></a:gsLst><a:path path="circle"><a:fillToRect l="50000" t="50000" r="50000" b="50000"/></a:path></a:gradFill></a:bgFillStyleLst></a:fmtScheme></a:themeElements><a:objectDefaults/><a:extraClrSchemeLst/></a:theme>';

  // process each CSV file
  for (var j = 0; j < files.length; ++ j) {
    file = files[j];
    ext = file.split('.').pop().toLowerCase();

    var xlsxFile = file.replace(/\.[^\.]*$/, '.xlsx');
    if (FSO.FileExists(file) && ext == CSV_ext.toLowerCase() && !FSO.FileExists(xlsxFile) && !FSO.FolderExists(xlsxFile)) {
      file = FSO.GetFile(file);

      var contents = ReadBinaryFile(String(file));
      var csvData = contents[0], textData = contents[1];
      try {
        csvData = decodeURIComponent(escape(csvData));
      } catch (e) {
        // ANSI file
        csvData = textData;
      }
      
      csvData = csvData.replace(/\r?\n|\r/g, '\r\n'); // convert line endings to Windows-based '\r\n'
      csvData = ParseCSV(csvData);
      
      // .\xl\workbook.xml
      var workbook = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"><sheets><sheet name="' + XMLEscape(file.name.replace(/\.[^\.]*/, '')) + '" sheetId="1" r:id="rId1"/></sheets></workbook>';

      // .\xl\worksheets\sheet1.xml
      var worksheet = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"><sheetData>';
      var strings = [], strings_map = {};
      for (var row = 0; row < csvData.length; ++ row) {
        var row_id = String(row + 1);

        worksheet += '<row r="' + row_id + '">';
        for (var col = 0; col < csvData[row].length; ++ col) {
          var value = csvData[row][col];
          if (value) {
            var col_id;
            if (col >= 27 * 26) {
              col_id = String.fromCharCode(65 + Math.floor((col - 27 * 26) / 26 / 26))
                     + String.fromCharCode(65 + Math.floor((col - 27 * 26) / 26) % 26)
                     + String.fromCharCode(65 + col % 26);
            } else if (col >= 26) {
              col_id = String.fromCharCode(65 + Math.floor((col - 26) / 26))
                     + String.fromCharCode(65 + (col - 26) % 26);
            } else {
              col_id = String.fromCharCode(65 + col % 26);
            }

            var string_id;
            if (value in strings_map) {
              string_id = strings_map[value];
            } else {
              string_id = strings.length;
              strings_map[value] = string_id;
              strings.push(value);
            }

            var style_id = value.indexOf('\r\n') ? 1 : 0;
            worksheet += '<c r="' + col_id + row_id + '" s="' + style_id + '" t="s"><v>' + string_id + '</v></c>';
          }
        }
        worksheet += '</row>';
      }
      worksheet += '</sheetData></worksheet>';

      // .\xl\sharedStrings.xml
      var sharedStrings = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">';
      for (var string_id = 0; string_id < strings.length; ++ string_id)
        sharedStrings += '<si><t>' + XMLEscape(strings[string_id]) + '</t></si>';
      sharedStrings += '</sst>';

      // create uncompressed zip file containing all of the necessary files
      var zipFile = '', zipIndex = '', zipFiles = [];
      zipFiles.push(MakeZipFile('xl/'));
      zipFiles.push(MakeZipFile('xl/sharedStrings.xml', sharedStrings));
      zipFiles.push(MakeZipFile('xl/styles.xml', styles));
      zipFiles.push(MakeZipFile('xl/theme/'));
      zipFiles.push(MakeZipFile('xl/theme/theme1.xml', theme));
      zipFiles.push(MakeZipFile('xl/workbook.xml', workbook));
      zipFiles.push(MakeZipFile('xl/worksheets/'));
      zipFiles.push(MakeZipFile('xl/worksheets/sheet1.xml', worksheet));
      zipFiles.push(MakeZipFile('xl/_rels/'));
      zipFiles.push(MakeZipFile('xl/_rels/workbook.xml.rels', workbook_rels));
      zipFiles.push(MakeZipFile('[Content_Types].xml', ContentTypes));
      zipFiles.push(MakeZipFile('_rels/'));
      zipFiles.push(MakeZipFile('_rels/.rels', rels));

      // create the zip index, which goes at the end of the file
      for (var i = 0, offset; i < zipFiles.length; ++ i) {
        offset = zipFile.length;
        zipFile += zipFiles[i];
        zipIndex += MakeZipIndex(offset, zipFiles[i]);
      }
      zipIndex += MakeZipIndexEnd(zipFile.length, zipFiles.length, zipIndex);
      WriteBinaryFile(xlsxFile, zipFile + zipIndex);
      success = true;
    }
  }
}

if (!success) {
  WScript.Echo('No files were processed.\n\nDrag files or folders onto this script to process them.');
}

To run it programmatically, use wscript csvtoxlsx.js, followed by each CSV file or folder as a separate argument.

edit: fixed UTF-8
edit²: fixed cell wrap text alignment for multi-line cells and added automatic line ending conversion

heterodox

@anotherusername said in Excel is fundamentally broken.:

I am probably certifiably insane now

*stares* You certainly are. Well done.

HardwareGeek

@anotherusername said in Excel is fundamentally broken.:

I am probably certifiably insane now,

You've been here 4 years and made over 14k posts. You're way past "probably."

Filed under: We all are.

Lorne Kates

With all the time and effort you're going to put into this, it'll just be easier to print the dataset, and hire minimum-wage data entry clerk to re-type it directly into an Excel doc.

AyGeePlus

@anotherusername

Wait, why did you do this? What can read .xlsx that can't read .csv?

@dfdub said in Excel is fundamentally broken.:

@cabbage Holy shit, TIL. Does it at least not do that in protected mode?

Based on what I read it throws up a warning like 'oh it's launching a program' and 'oh it's contacting an outside source' if run from protected mode. Given that there are probably thousands of business-critical excel worksheets that use these webhooks, it would be pretty easy to distribute something that required those boxes to be clicked to work.

anotherusername

@aygeeplus said in Excel is fundamentally broken.:

Wait, why did you do this? What can read .xlsx that can't read .csv?

Excel. Haven't you been paying attention?

The key thing about the script is that it treats every cell like text. No parsing numbers into floats that lose precision... no parsing anything that looks halfway remotely like a date format into a date. Every cell that it creates in the .xlsx is tagged as a 'text cell.

Gąska

@anotherusername the only complaint I have is having to specify a single line ending. Come on, it's not that hard to support both. [\r\n]+, or \r?\n if you care about empty lines but not MacOS Classic.

AyGeePlus

@anotherusername right, but Excel strips the text-specifying single quote on save.

But at least if you set it read-only you can look at it. Ok, I'm caught up.

anotherusername

@aygeeplus said in Excel is fundamentally broken.:

@anotherusername right, but Excel strips the text-specifying single quote on save.

When saving as a CSV, yes. But when saving as XLSX, no... that cell style is preserved. So once my script has converted it, there should be no more issues with Excel changing stuff. You get the text with no modification.

@gąska said in Excel is fundamentally broken.:

@anotherusername the only complaint I have is having to specify a single line ending. Come on, it's not that hard to support both. [\r\n]+, or \r?\n if you care about empty lines but not MacOS Classic.

I can fix this, but two issues come into play. One, my CSV parser needs to know the length of the thing it's matching here, so that after it matches, it can skip past it. Changing the literal to a regexp would make that slightly more complicated; it'd need to capture the match, and check the length of the captured value. And two, Excel itself (at least for Windows; I don't know about other systems) uses \r\n for embedded newlines in cell contents (which my CSV parser does allow, if they're inside a quoted string... although now that I've tested it, Excel needs a cell style that has applyAlignment and wrapText for cells with multi-line content, so I'll fix that too).

As a result, I think the best option is to automatically convert all line endings to \r\n before parsing, and leave the row delimiter default as '\r\n'. So I've edited it to do that.