
Forum: Author Hangout

Save as Filtered HTML

Switch Blayde

In another thread, Lazeez said the SOL Wizard using .docx input is crap and to use Filtered HTML when submitting. I've never done that.

I tried it and got a message that basically said that it would remove Office-specific tags and that some Office features may not be available when reopening.

Reopening what? The .docx file that I saved as HTML? Or the HTML file outputted? And what are Office-specific tags that might be lost? And what kinds of features might be lost?

CB

@Switch Blayde

A few weeks ago Lazeez explained exactly why the .docx input might be crap. One reason was that overlapping formatting could cause display issues on SOL. I'd seen this on a few of my chapters, so I decided to start submitting the filtered file. I only create the HTML file for submitting and keep the .docx file for working or future editing. So when I create the HTML files, I still have the original .docx to fall back on. Don't you?

Replies:   Switch Blayde
Switch Blayde

@CB

I only create the HTML file for submitting and keep the .docx file for working or future editing.

Aha, that's the answer to my question. Nothing happens to the .docx file. They're talking about the generated .html file. It confused me when they said, "when you reopen it."

Thanks.

Replies:   CB
CB

@Switch Blayde

Sorry, my reply was delayed and later edited as supper was ready. Yes, the original is saved if you create the HTML file with "save as".

I "Save" followed by "Save as" to create the other format.

Replies:   Switch Blayde
Switch Blayde

@CB

Yes, the original is saved if you create the HTML file with "save as".

I don't know what you mean by this.

Replies:   Dinsdale
Dinsdale

@Switch Blayde

My interpretation would be

Yes, the original is preserved if you create the HTML file with "save as".

Which is what one would expect.

Replies:   Switch Blayde
Switch Blayde

@Dinsdale

Which is what one would expect.

Yes, that is what I would expect. The "save as" shouldn't affect the original file. It just creates a new file with a different name and/or extension. Save this file as that file.

It was Word's wording that made me doubt that it would do it the way it's supposed to do it.

Ernest Bywater

@Switch Blayde

Most word processor programs assign the basic format code at the paragraph level, once for each paragraph, while HTML assigns it at the document level. In many cases a word processor also assigns format code at the word or character level. MS Word is much worse about adding this information than any other word processor program, but all of them do it.

When you save as filtered HTML the bulk of this excess coding is eliminated from the saved HTML code. However, if you later open the HTML file in MS Word it will have issues reading the file because it wants and expects to see all of that skimmed off excess code.

Another way to think of it is the DOCX file has the format information with each paragraph that sets out the font type, font size, paragraph indentation, paragraph alignment, and often the font colour while the filtered HTML will have all of that code a single time at the start of the document. Thus a DOCX file with 1,000 paragraphs will have 1,000 instances of all of that code while a filtered HTML file with 1,000 paragraphs will have only 1 instance of that code.

Replies:   Switch Blayde
Switch Blayde

@Ernest Bywater

if you later open the HTML file in MS Word it will have issues reading the file because it wants and expects to see all of that skimmed off excess code.

I see

Vincent Berg

@Switch Blayde

Both Ernest and I have gone into the 'crap' that MS dumps into WORD, which we've always stripped out simply because it significantly impacts how effective the html is (i.e. if the processor is continually checking invalid codes, it's NOT processing the codes which ARE meaningful nearly as quickly).

But the things that WORD adds as html entries are for internal housekeeping only, like names, place names, numerics and other details like that, which are ONLY there so that WORD can recreate the original WORD file from the html file.

Saving the file as "Filtered html" is the most effective method of stripping those details out, though it still leaves a LOT of crap behind (ex: < span> commands around internal formatting such as italicizing and emphasis commands: "< span style='font-family:"Garamond",serif'>"). Since that and the other commands don't DO anything, there's really no point to them, but when the page loads, the html code has to run through ALL the various commands to determine whether each applies in some way. The Span command is a handy way of saying 'pardon me, but I've got my own reason for burying this code, so please, don't worry about what it is'. It is most often used because Apple has its own formatting rules and needs a duplicate "Centering" rule, because it'll ignore ANY center command that doesn't have the dual coding.

SOL, on the other hand, treats all html text as text only, only recognizing a few html commands, specifically in-line formatting < i> < b> and < u>, < a hlink> and < img commands>, as it strips out ALL predefined paragraph Styles (such as first line indenting, or lack thereof, or other specifically formatted paragraph types, such as indented text).

For those of us who also submit our stories to other sites for publishing, this is a real pain in the ass, as we've got to keep separate versions of each chapter, section or book. Luckily, SOL does NOT choke when it encounters the various paragraph types. It simply ignores the paragraph styles, so they essentially 'pass through' without negatively impacting anything.
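For anyone who wants to clean the leftover < span> wrappers out of a filtered HTML file by hand, here is a minimal Python sketch. It is only an illustration, not what SOL or Word does internally: the function name `strip_spans` and the sample paragraph are made up for the example, and the pattern assumes no attribute value contains a `>` character.

```python
import re

# Matches opening <span ...> tags (with any attributes) and closing </span> tags.
# Assumption: attribute values never contain a literal ">" character.
SPAN_TAG = re.compile(r"</?span\b[^>]*>", re.IGNORECASE)


def strip_spans(html: str) -> str:
    """Remove span wrappers but keep the text and any other tags inside them."""
    return SPAN_TAG.sub("", html)


# Hypothetical sample resembling the font span that Filtered HTML leaves behind.
sample = "<p><span style='font-family:\"Garamond\",serif'>She <i>ran</i> home.</span></p>"
cleaned = strip_spans(sample)
print(cleaned)
```

The in-line < i> tag survives because only the span tags themselves are removed, not what they enclose.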

Replies:   Switch Blayde
Switch Blayde

@Vincent Berg

only recognizing a few html commands, specifically in-line formatting < i> < b> and < u>, < a hlink> and < img commands>

I thought SOL recognized blockquote. I was actually going to use it in my current short story that I will save as filtered html, but couldn't figure out how to do it since Word doesn't have a blockquote. When I do my ebooks, I handle blockquote in Word with left and right margin indenting. But I don't think save as filtered html would convert that to an HTML blockquote so I didn't use the blockquote.

ETA:

I wondered what if I coded in Word a < blockquote> and < /blockquote> around my paragraphs that are in the blockquote and then saved it as filtered html. I wondered if Word would leave it there and then SOL would recognize it. But without a previewer, I decided not to try.

Lazeez Jiddan (Webmaster)

@Switch Blayde

I wondered what if I coded in Word a < blockquote> and < /blockquote> around my paragraphs that are in the blockquote and then saved it as filtered html. I wondered if Word would leave it there and then SOL would recognize it. But without a previewer, I decided not to try.

It should work.

Vincent Berg

@Switch Blayde

I thought SOL recognized blockquote.

It does, but my point was that it's extra steps that run contrary to my other publications (i.e. I have to remember each chapter where I have indented text, or create a separate set of files for posting to SOL (which is what I do)).

Also, the formatting of block quotes on SOL is different than html supports, which often differs from how books traditionally format indented quotes.

Lazeez Jiddan (Webmaster)

@Vincent Berg

block quotes on SOL is different than html supports

While the official documentation is for {block}{/block}, so many authors make the mistake that I added support for { blockquote} { /blockquote}, and the html->tags converter understands and supports < blockquote>.

The html converter tries to support the 'centre' CSS, but not indentation/margins.

Replies:   Switch Blayde
Switch Blayde

@Lazeez Jiddan (Webmaster)

While the official documentation is for {block}{/block}, so many authors make the mistake that I added support for

, and the html->tags converter understands and supports < blockquote>.

I'm confused. I thought you said if I had < blockquote> with a < /blockquote> in my document that I save as filtered HTML, it should work and give me an indented blockquote on SOL.

What is {block}{/block}

Replies:   Michael Loucks
Michael Loucks

@Switch Blayde

What is {block}{/block}

See: SOL Text Formatting Guide

Block Quoted text:

Usage should be restricted to small blocks of text that need to be distinguished from the main body. Example for use is a note, letter or a flashback in the story.

The {block} {/block} tags cause the text to be indented and italicized with a vertical rule on its left. It can enclose many paragraphs, so it needs a closing tag and must be on a line of its own, separated from the text by double returns on either side.

Replies:   Switch Blayde
Switch Blayde

@Michael Loucks

Block Quoted text:

Ok, but the < blockquote> embedded in the "saved as filtered html" output should do the same thing (without the vertical line and forced italics).

What confused me was the "so many authors make the mistake that I added support for the html->tags converter understands and supports < blockquote>"

That implies I cannot use the < blockquote> (an html tag). But earlier Lazeez said it should work.

Lazeez Jiddan (Webmaster)

@Switch Blayde

What confused me was the "so many authors make the mistake that I added support for the html->tags converter understands and supports < blockquote>"

Oops. I should have reviewed my post. I added support for the { blockquote}{ /blockquote} tags. The html converter has always been able to see and deal with the < blockquote> html tag.
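For authors who prefer to submit SOL-tagged text rather than rely on the html converter, the mapping discussed above can be sketched in Python. This is a rough illustration only and is not SOL's actual converter: the function name `blockquote_to_block` is hypothetical, nested blockquotes are not handled, and the only rule it follows is the one quoted from the formatting guide, that each {block} tag sits on a line of its own with double returns on either side.

```python
import re

# Matches one <blockquote ...>...</blockquote> pair, spanning multiple lines.
# Assumption: blockquotes are not nested inside each other.
BLOCKQUOTE = re.compile(
    r"<blockquote\b[^>]*>(.*?)</blockquote>",
    re.IGNORECASE | re.DOTALL,
)


def blockquote_to_block(html: str) -> str:
    """Rewrite HTML blockquote elements as {block}...{/block} tags."""

    def repl(match: re.Match) -> str:
        inner = match.group(1).strip()
        # Each tag goes on its own line, separated by double returns.
        return "\n\n{block}\n\n" + inner + "\n\n{/block}\n\n"

    return BLOCKQUOTE.sub(repl, html)


converted = blockquote_to_block("<blockquote>Dear John,</blockquote>")
print(converted)
```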
