
Forum: Author Hangout

I hate HTML coding

StarFleet Carl

I thought the whole purpose was to end up with WYSIWYG - what you see is what you get. I type in Libre Office, I don't do anything stupid and weird, and when I tell it to save as an HTML file, it converts things. Maybe with more code than some of you guys like, but it's still fine. That way, when I load the HTML document up into Sigil to create the EPUB, all I have to do is scroll through and make sure things are fine.

Chapter One - yeah, that's okay. Halfway through Chapter Two - why did you just randomly change font color and start adding all sorts of extra crap, such that it took me over an hour to get it cleaned up? Chapter Three, you made it until the last three pages, then did the SAME THING! QUITE annoying to have to open a text window, copy paste the paragraphs from my ODT file into the text window with the appropriate HTML already in place, then copy and replace, one paragraph at a time, 22 pages.

And then the rest of the book was fine - no extra code, no random color changes, life is good. What the hell?

Okay, venting over now. Letting Flight Crew run for a while to see what it comes up with.

Keet

@StarFleet Carl

You probably have something in your ODT file that causes the HTML export to include that crap.
Check this LO extension: https://extensions.libreoffice.org/en/extensions/show/clean-and-validate-for-publishing-with-pagination which specifically checks for such things that disrupt a clean export.
Other extensions for publishing might help too or do an even better job.

StarFleet Carl

@Keet

You probably have something in your ODT file that causes the HTML export to include that crap.

I've downloaded it, and I'm using it to effectively do what Laz suggested. I take my file, then use that to strip everything EXCEPT italics and special characters out, and then put the appropriate HTML headings and fonts BACK in - and lo, I'm getting clean files.

Ernest Bywater

@StarFleet Carl

First thing to know is that Libre Office is a word processing program while html is a markup language - they're like north and west: not direct opposites, but they don't line up too well either.

The second thing is when you write in any word processing program you should establish styles for every type of paragraph you intend to use then use them at all times.

.................

Now to explain the conflict:

In word processing you establish the general format parameters for the text of each paragraph and then just enter the few changes you want for a few individual characters or words in the text of that paragraph. The most common changes being the use of bold and italics. Establishing the paragraph formats is best done via a style designation.

In basic html you establish the general format parameters for the text of the whole document and then just enter the few changes you want for a few individual characters or words in the text. The most common changes being the use of bold and italics.

With advanced html you use things like a CSS which is a style sheet where you establish styles for paragraphs, or groups of text, or characters then you call on those styles within the html code. This is the best way to set up the html to create an e-pub from.

An e-pub should have h1 and h2 codes on the chapter and sub-chapter headings (tools such as Sigil use them to build the table of contents), with the appropriate style code either with the headings or in the CSS.

..................

I write my stories using Libre Office. Because I need multiple html files for the finished story, I save as html, then run a script to clean out most of the excess code to get my baseline html file. On a copy of that I run a second script to get a finished file for submission to SoL. Then I run a third script on the original baseline html file to create my website html file, into which I add a CSS for an inverted colour version for personal use, and save. I then replace the CSS with another that has a different set of colours, also for my personal use. Last, I remove the html index in the file, add the html for the 'other story info at the end', and save that as the html to make the e-pub from. I really only need two html versions, for SoL and the e-pub, but I make the other two for my own use - one for the pc and one for the tablet.

Keet

@Ernest Bywater

but I make the other two for my own use - one for the pc and one for the tablet.

There's no need to make two versions just because the devices have different screen sizes. Look up the CSS @media rules (https://www.w3schools.com/cssref/css3_pr_mediaquery.asp) where you can set different rules for different screen sizes. Example:
@media screen and (max-width: 600px) {
  /* phones */
  /* your rules */
}
@media screen and (min-width: 600px) {
  /* tablets in portrait */
  /* your rules */
}
@media only screen and (min-width: 768px) {
  /* tablets in landscape */
  /* your rules */
}
@media only screen and (min-width: 1024px) {
  /* laptops, desktops */
  /* your rules */
}
It even catches resizing on larger screens so that if you make the browser window on a desktop smaller it adjusts to the rules for the smaller width.
You also don't need different versions for a 'normal' and dark theme. It requires a little JavaScript for switching but the same html can then use the selected CSS for a theme.
You can check how it works on the ReaderInfo site where both are used.

Ernest Bywater

@Keet

The reason I have two versions is not technical - it's to do with my eyesight. The HTML code I use allows whatever system it's on to adjust the display to the system size, colour, and fonts. However, I prefer to read on the PC with one colour setup while on the tablet I prefer to use the inverse colour setup.

The only difference between the CSS for both versions is the choice of colour for each style. The actual code after that is exactly the same.

The SoL version is different as it has a very different CSS, and the e-pub version differs because some of the content changes.

StarFleet Carl

@Ernest Bywater

The second thing is when you write in any word processing program you should establish styles for every type of paragraph you intend to use then use them at all times.

I know we've had part of this discussion before. I think the light is FINALLY dawning on me on how to set things up in the first place - which is what I haven't been doing - instead of expecting the software to work miracles and fix things for me when I'm done.

We'll see. The funny thing is, I know basic and some advanced coding for HTML for creating my own websites - I've done that in the past. It's this whole converting a document over that still screws me up. When you talk about writing your own scripts, that's when I go WTF?

Also, the code I was having inserted, purely at random, it seemed, was (/span). I know not to put the left and right arrows in here. The first couple of chapters were fine, then things suddenly went to hell, then things were fine, later.

I'm now 'cheating', going through each file again and gutting / changing things, then adding the individual chapters to the Sigil EPUB. It's scary, but it's actually working fairly easily, too. (That's the part that bothers me!)

Ernest Bywater

@StarFleet Carl

When you talk about writing your own scripts, that's when I go WTF?

I use Manjaro Linux and find it quicker and easier to do some things using a script to do the work.

Note: in the code mentioned below spaces are used to stop it being run as code within the message (I hope).

However, what the basic script does is to clean out all of the font format, margin format, and language format code as well as replace the hexadecimal colour numbers with the colour names. Examples of what is removed are: margin-bottom: 0cm; letter-spacing: normal; font-style: normal; align="justify" - etc.
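For anyone who would rather see that sort of clean-up in code, here is a minimal Python sketch. It is not the actual script described above: the function name `clean_html`, the list of dropped properties, and the small colour table are all assumptions made for illustration, built only from the examples quoted in the post.

```python
import re

# Assumption: attributes and style properties to drop, based on the examples above.
UNWANTED_ATTRIBUTES = re.compile(r'\s+(?:align|lang|xml:lang)="[^"]*"', re.IGNORECASE)
STYLE_ATTRIBUTE = re.compile(r'\s+style="([^"]*)"', re.IGNORECASE)
DROPPED_PROPERTIES = ("margin", "letter-spacing", "font-style", "font-family", "font-size")

# Assumption: a tiny hex-to-name table; a real one would list every colour in use.
COLOUR_NAMES = {"#ff0000": "red", "#0000ff": "blue", "#000000": "black"}


def _clean_style(match):
    """Keep only the style declarations whose property is not on the drop list."""
    kept = []
    for declaration in match.group(1).split(";"):
        declaration = declaration.strip()
        if not declaration:
            continue
        prop = declaration.split(":", 1)[0].strip().lower()
        if prop.startswith(DROPPED_PROPERTIES):
            continue
        kept.append(declaration)
    if not kept:
        return ""  # nothing left, so remove the whole style attribute
    return ' style="' + "; ".join(kept) + '"'


def clean_html(text):
    """Strip export clutter from an HTML string and name the known colours."""
    text = UNWANTED_ATTRIBUTES.sub("", text)
    text = STYLE_ATTRIBUTE.sub(_clean_style, text)
    # Note: this replaces the hex codes anywhere in the text, not only inside tags.
    for hex_code, name in COLOUR_NAMES.items():
        text = re.sub(re.escape(hex_code), name, text, flags=re.IGNORECASE)
    return text
```

Run on a copy of the exported file first, since a pattern-based clean-up like this cannot tell wanted formatting from unwanted if both use the same property.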

The other two scripts convert the standard html code into what I need to make the CSS work better. An example for the SoL script is to change the H1 to H3 and add the red colour for the text to have a red chapter heading - the actual conversion is from < h1 > to < h3 class = " c " > < span class = " red " > and the closing code for the H1 is changed to the closing for this code.
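As a rough illustration of that heading conversion, here is a short Python sketch. The class names "c" and "red" come from the example above; the function name `convert_headings` and the regular expressions are assumptions, not the real script.

```python
import re

# Match any opening h1 tag (with or without attributes) and its closing tag.
H1_OPEN = re.compile(r"<h1\b[^>]*>", re.IGNORECASE)
H1_CLOSE = re.compile(r"</h1\s*>", re.IGNORECASE)


def convert_headings(text):
    """Turn every h1 heading into an h3 with a red span inside it."""
    text = H1_OPEN.sub('<h3 class="c"><span class="red">', text)
    text = H1_CLOSE.sub("</span></h3>", text)
    return text
```

Any attributes the export put on the original h1 are discarded, which is usually what is wanted when the CSS is meant to control the look.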

..............

The CSS works by applying a style using the code class=" xxx " and when applied to anything within a paragraph it is done as < span class = " xxx " > then closed with the < / span > command (this is the closure tag for all span commands).

When most software is converting to the xhtml used for making an epub it will create a CSS and assign 'classes' while also converting everything to suitable 'span' commands to apply the class. Thus where basic html has < b > < / b > to open and close bold for text there will be a CSS style with the format for the bold and use span commands. In my CSS I have, for bold:

. bold {
font-weight: bold;
line-height: 1.0em;
}

then I call it with < span class = " bold " > and close it with < / span >.

...........

The down side of most software used to create an e-pub is that it will create its own classes for anything that does not already have a class assigned from the CSS. Thus if you have a paragraph simply as < p > the conversion software will assign its own class and the < p > will be changed to < p class = " yyy " > and the same happens for text format changes within the paragraph.

Thus your < b > and < i > commands will be assigned a class and converted to have that class applied with a < span class = " vvv " > command while the < / b > and < / i > will all be changed to < / span > and the same will happen with all of the font colour commands.
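One way to stay ahead of the conversion software is to do that swap yourself, so the classes are the ones in your own CSS. A minimal Python sketch follows; the class name "bold" matches the CSS example above, while "italic" and the function name `tags_to_spans` are assumptions for illustration.

```python
import re

# Assumption: which CSS class stands in for which basic html tag.
TAG_CLASSES = {"b": "bold", "i": "italic"}


def tags_to_spans(text):
    """Replace bare <b> and <i> pairs with span tags that call a CSS class."""
    for tag, css_class in TAG_CLASSES.items():
        text = re.sub(rf"<{tag}\s*>", f'<span class="{css_class}">', text, flags=re.IGNORECASE)
        text = re.sub(rf"</{tag}\s*>", "</span>", text, flags=re.IGNORECASE)
    return text
```

The patterns only match the bare tags, so something like a line break tag or an image tag is left alone.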

I have your email address, so if you want me to I can send you copies of what I use as the CSS and the script files. The scripts I run are for Linux but I know similar ones doing the same things can be written for use on Windows. I can also send you a copy of the basic Libre Office layout I use so you can see how I do things.

Lazeez Jiddan (Webmaster)

@StarFleet Carl

Chapter One - yeah, that's okay. Halfway through Chapter Two - why did you just randomly change font color and start adding all sorts of extra crap, such that it took me over an hour to get it cleaned up?

My long experience with receiving hundreds of thousands of documents from many authors, using different tools and processes, taught me that documents edited by multiple people (editors) and anything that went through many changes, will always have a ton of formatting junk hidden in their files. When exported into html, the junk, well hidden by the word processor, may become more evident when a different client is tasked with rendering it. I deal with it every day.

Best way to avoid mishaps:

Once all content editing is done, it's time to dispose of the junk. To do that:

1 - Export the file into plain text, saved as .txt. This step gets rid of the junk, leaving only the contents.

2 - Create a new document in the word processor and import/paste the plain text from the previously exported .txt file.

3 - Use stylesheets when possible to format paragraphs.

4 - Apply the styles you needed (italics, bolding, etc). If you make a mistake, use 'undo'. Don't try to fix it by re-applying correct formatting, that will insert unwanted junk.

That way there is no hidden formatting to sneak into the final product.

It may be somewhat tedious, but not as much as trying to manually fix a file full of junk formatting and it's waaaayy more reliable to produce the expected result.
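If all that is to hand is an HTML export rather than the original document, the "strip to plain text" step can be approximated with a short Python sketch. This is a different route to the same result, not the word processor's own export, and the class and function names (`PlainTextExtractor`, `html_to_plain_text`) are hypothetical.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects visible text, one entry per block-level element."""

    BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"style", "script", "title"}  # content that is not story text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._buffer = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "br":
            self._buffer.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._buffer.append(data)

    def flush(self):
        # Collapse runs of whitespace and store the paragraph if anything is left.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []


def html_to_plain_text(html):
    """Return the text of an HTML string with all formatting removed."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # pick up any trailing text outside a block element
    return "\n\n".join(parser.paragraphs)
```

As with the manual procedure, everything including italics and bolding is lost, so the styles still have to be re-applied afterwards.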

StarFleet Carl

@Lazeez Jiddan (Webmaster)

documents edited by multiple people (editors)

It's funny, but I avoid that by only sending out copies of the files. Then I make corrections to my copy myself. I've actually just now found the style section of Libre Office and I'm using it - and it's almost scary, how it's working. I'm probably like most of your authors, just sit down and type, because we're old farts. And at my age, I learned how to type on a manual typewriter - we got electrics while I was IN high school. (My graduation present WAS a very nice electric typewriter.) So I'm just hitting keys and blithely creating the story, and not giving any thought to how to make the story look right. Well, next book will be written with all of this in mind from the very beginning.

ETA: If nothing else, I'm making shorter chapters, because I'm editing out all of those carriage returns that I automatically learned to put in while typing that aren't needed.

Keet

@StarFleet Carl

I sometimes wonder why so many authors use LibreOffice or MS Word for writing while the end product is always released as HTML and EPUB (which is basically XHTML).
Wouldn't a simple (HTML) editor be much more convenient as long as the usual spelling check is available?

For SOL you don't need much more than what is available when replying to a post here on the forum :)

Ernest Bywater

@Keet

I sometimes wonder why so many authors use LibreOffice or MS Word for writing

There's a few reasons, one of which is us being old farts who prefer to use the capabilities of a Word Processor program over the shit we had to live with way back then when all you could type into on a computer was a plain text file. Another is some of us, like me, want a finished product that is also a 'print ready' PDF file so we use a word processing program that provides that. I also prefer a word processor to a plain text program as it allows me to see what it will look like in the finished product.

StarFleet Carl

@Keet

I sometimes wonder why so many authors use LibreOffice or MS Word for writing

Eyesight. 20/40 uncorrected in my right eye (but I need to go in, I think it's down to 20/60), and 20/600 in my left. I regularly run my screens at 125% anyway, as well.

The other thing is that my Epilog Laser is truly WYSIWYG. So I'm MUCH more used to using Corel Draw, and simply sending the output directly to the printer.

And, of course, the other minor detail that we do happen to use our Office programs to actually write REAL paper documents, too, might have something to do with it. I frequently have to create spreadsheets to print to show clients what the best offer we've received is, or use a writer to create letters to send out.

Switch Blayde

@Keet

Wouldn't a simple (HTML) editor be much more convenient as long as the usual spelling check is available?

Who wants to write in HTML? And who's likely to make an HTML coding error, me or a well-tested conversion program?

And keep in mind, if you're sending your novel to a literary agent or publisher it needs to be docx. You have more flexibility as a docx.

I don't know anything about HTML editors, but word processors have a lot of features. That's why they're word processors.

Grey Wolf

@StarFleet Carl

Same. I send out only copies of the files and do all editing myself. I also work in Scrivener, which only outputs to external formats when I tell it to. Internal files are (I believe) RTF, which has a lot less garbage in it.

I have not tried EPUB export, but there's reason to believe it will work fairly well, at least once I jump through some hoops once (as I did with docx output).

Keet

@Grey Wolf

Internal files are (I believe) RTF, which has a lot less garbage in it.

Uhm, RTF can have a LOT of garbage in it. In some cases the style codes exceed the amount of actual text...

Grey Wolf

@Keet

Fair enough. I'm not sure this would, but then I also don't feel like digging into the component files and looking...

Ernest Bywater

@Lazeez Jiddan (Webmaster)

Export the file into plain text, saved as .txt. This step gets rid of the junk, leaving only the contents.

While this process is good for many people it's a major issue for me due to the many formats I use within my story with bold, italics, centred text, indented text, and coloured text. This would all be lost in the above process and would require a lot of work to reformat as I want it.

However, those who don't use such an extensive range of formats would find the advice very useful.

StarFleet Carl

@Lazeez Jiddan (Webmaster)

In the for what it's worth department, I have part of the next chapter already written, for the next book. I did this procedure on that, to see what it'd do. I still got errors with Flight Crew while trying to validate it as an EPUB2. I then did an export as an EPUB3, and the basic part of Sigil validated it, but Flight Crew looked at it and puked - it only does EPUB2 documents.

I took the modified version YOU sent me and ran Flight Crew with Sigil on it, and it gave me multiple font errors. But your copy looked really close to the EPUB3 file that was generated. I'm now totally confused, but that's not uncommon for me at this point.

Keet

@StarFleet Carl

I'm now totally confused, but that's not uncommon for me at this point.

Happens to all of us every once in a while ;)

Keet

@StarFleet Carl

Thank you for the reasons to use LO or Word. I understand you use what you know and take advantage of the extended formatting options. I'm an old fart too and use LibreOffice for 'normal' documents. But for HTML, CSS, PHP, and JavaScript I mostly use a plain text or HTML editor (gedit, Bluefish). They offer spelling check, code coloring and other facilities like code completion too through plugins. I must see what the HTML code itself is which can't be done with LO. I'm not an author so publishing is not a consideration for me. PDF is, because that's how most documents go to clients via email.
If you regularly have problems with creating EPUBs it's a very good alternative.

StarFleet Carl

@StarFleet Carl

Screw this - I'm 'cheating'.

There's a very nice EPUB document sitting online - thank you, Lazeez. It'll take me some time, but I'm going to use THAT as my template, and simply stick everything from my stories into THAT.

More importantly - I'll use THAT from the beginning for Book 5, so ALL of the chapters will be done properly. (I frigging well HOPE!)

JoeBobMack

@StarFleet Carl

"If you ain't cheatin', you ain't tryin'!"

Paladin_HGWT

@StarFleet Carl

Is there a Link to the EPUB template provided by Lazeez? Or perhaps I am thinking less clearly than usual.

I recently noticed some odd effects that are afflicting my most recent chapter.

I have been using the guides in the Authors section, but I am not aware of any "Template"?

Switch Blayde

@Paladin_HGWT

Is there a Link to the EPUB template provided by Lazeez?

I could be wrong (I often am), but I think he's saying he'll take the epub generated by Lazeez and modify it to use as a template for his future books.

Lazeez Jiddan (Webmaster)

@Paladin_HGWT

EPUB template provided by Lazeez

There was no 'Template'. Carl submitted an ebook and I tried to validate it. It had a ton of xhtml errors (there was a lot of HTML3 code in it) and I fixed it up. Nothing useful in the result for use later. I simply cleaned up the text's formatting, removing all the < font > tags etc...

For info about how to go about it:

I have a Mac. I have installed 'Homebrew' on it (free open source software). Homebrew allows you to install Unix software easily.

There is a unix package called 'epubcheck', I installed it using homebrew and I use it to validate EPUB files before approving them for Bookapy. Anybody with a Mac or Linux box can install epubcheck and use it through the command line.

If anything needs fixing, I open the EPUB in BBEdit (powerful Mac text editor) and go through what needs to be fixed. BBEdit allows you to open Zip archives and edit the files inside including EPUB-wide GREP search and replace. An EPUB is a zip archive so it opens in BBEdit easily and perfectly. Any edits are saved to the compressed files as though it's a regular text file.

If epubcheck complains about internal file URLs (internal files shouldn't have spaces or weird characters in their names), I open the EPUB in calibre's ebook editor and fix those as it's easier.
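Because an EPUB is a zip archive, the file-name problem can be spotted before validation with a few lines of Python. This is only a sketch and not a substitute for epubcheck: the function name `suspicious_names` and the "safe" character set (letters, digits, dot, underscore, hyphen, and the folder slash) are assumptions, chosen to be on the cautious side.

```python
import re
import zipfile

# Assumption: a conservative idea of a "safe" internal file name.
SAFE_NAME = re.compile(r"[A-Za-z0-9._/-]+")


def suspicious_names(epub_path):
    """List the files inside an EPUB whose names contain spaces or odd characters."""
    with zipfile.ZipFile(epub_path) as archive:
        return [name for name in archive.namelist() if not SAFE_NAME.fullmatch(name)]
```

Calling `suspicious_names("mybook.epub")` (a made-up file name) returns the internal paths worth renaming in an ebook editor, or an empty list if none stand out.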

StarFleet Carl

@Lazeez Jiddan (Webmaster)

There was no 'Template'.

That's correct. What I'm doing is taking the EPUB that SOL ALREADY has for the online work, and simply modifying it. I've spent quite a bit of time over the last couple of days studying it.

I have no idea HOW they work, but I'm now seeing what Ernest and Laz were talking about with CSS and Styles. This is cool, because I never knew you could do some of this stuff with HTML.

Paladin_HGWT

@Lazeez Jiddan (Webmaster)

Thank you very much for the explanation.

I am far from submitting anything for EPUB.

I am still grateful to have a proofreader; eventually I will be looking for an editor. Until I at least complete the first book, another 25 chapters I estimate, I am not really even considering EPUB.

I am writing Aztlan Portal as much to get feedback on my writing in general as for anything else. I think I have learned quite a bit during the year I have been posting chapters.

I consulted the writing guides in the Author's section of SOL, as well as several books about self-editing, and other resources before I began putting chapters up on SOL.

The possibility of an EPUB Template is worth investigating.

I have learned several things about how to better compose chapters so they appear on SOL more like how I want them to. I have revised individual chapters to improve them. Writing is a constant learning process.
