As I mentioned in other threads, I've started creating my e-pubs from HTML code because that makes much smaller files than creating them from the word processor file. In the process I've learnt a few tricks, which I'll list below after describing my process and past problems.
My Process
I write all of my stories in a word processor program (Libre Office) using paragraph styles of title heading, heading 1, heading 2, heading 5, quotation, quotation2, centered, preformatted text, centred, default style. There are a few others like footers etc. that I'm leaving out as not relevant for this. I write it as an ODT file.
Once the story is written I also save it as a PDF file, and then as an HTML file using the 'Save as HTML' option. I clean up the HTML and save it as two files: an HTML file for SoL and an HTML file to create the e-pub from. I use the last as my personal website file as well.
Past problems
Very large e-pub files due to being created from the word processing file. After discussions with people more familiar with the XHTML used in e-pubs, I found that the program I use to create the e-pubs (Calibre) does the same silly thing as the 'Save as HTML' function of the word processor, in that it puts an excessive amount of format code within the file.
With the 'Save as HTML' option, each paragraph comes with a hell of a lot of format code (all word processors do this). Much of this format code is needed to ensure the finished file displays the way I want it to. Instead of leaving all of the code in the file, I create a style sheet, delete all of the format code within each paragraph, and identify each paragraph as the proper style sheet type. This is essentially the HTML version of assigning paragraph styles within the word processor.
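To give a rough idea of the clean-up described above, here is a sketch of one paragraph before and after. The inline format code in the "before" paragraph is only typical of word processor HTML export, not copied from a real file, and the class name `centered` and the values in the rule are assumptions chosen for illustration.

```html
<!-- Before: illustrative export-style paragraph, with the format code
     repeated inside every paragraph of the file -->
<p align="center" style="margin-top: 0.2cm; margin-bottom: 0.2cm; line-height: 120%">
  <font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">Chapter text here.</font></font>
</p>

<!-- After: the same format code written once in the style sheet
     (in the <head> of the file, or in a linked .css file) -->
<style>
  /* "centered" is an assumed class name matching the paragraph style */
  p.centered {
    text-align: center;
    margin-top: 0.2cm;
    margin-bottom: 0.2cm;
    line-height: 120%;
    font-family: "Liberation Serif", serif;
    font-size: 12pt;
  }
</style>

<!-- Each paragraph now only names its style -->
<p class="centered">Chapter text here.</p>
```

The saving comes from the rule being stored once rather than once per paragraph.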
As well as using the style sheet to assign paragraph format code I also use it to provide text color codes and alignment codes.
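A minimal sketch of what color and alignment entries in the style sheet might look like. The class names `red` and `right` and the color value are hypothetical, picked only for this example.

```html
<style>
  /* Hypothetical class names and color value, for illustration only */
  .red   { color: #cc0000; }     /* text color code */
  .right { text-align: right; }  /* alignment code */
</style>

<!-- Alignment applied to the whole paragraph, color to part of the text -->
<p class="right">Yours truly, <span class="red">the sender</span></p>
```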
NB: it was moving all of this paragraph code from each paragraph of the file to the style sheet that cut down on the size of the file.
........................
Recent Findings
In the last few weeks I found part of the issue was that the Calibre program assigns style codes, in every e-pub split, for everything that isn't in the overall style sheet list. A split is where it assigns a chapter break, so think of it as each chapter. Thus Calibre will create an additional style sheet for each chapter to list each type of text that isn't in the main style sheet.
After I went to a lot of trouble to ensure that every paragraph style was covered in the main style sheet, I was surprised to see Calibre still creating split style sheets. On further examination I found that while the system would read and use standard HTML code, it preferred everything to be in the style sheet, with span commands used for every text style other than the standard text style. That way everything is in the main style sheet.
When I added two text style commands to the style sheet and applied them using the <span class="xxx"> </span> commands around the text, the system no longer created split style sheets, as everything was now in the main style sheet. This further reduced the overall size of the e-pub file.
Thus while the SoL HTML file has <i> </i> and <b> </b>, the e-pub HTML file now has <span class="italic"> and <span class="bold">, with </span> to close them, in the same way I use the commands to apply color to the text.
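As a sketch of the difference between the two files: the class names `italic` and `bold` are the ones mentioned above, but the exact contents of the two rules are my assumption of the simplest way to write them.

```html
<style>
  /* The two text style commands added to the main style sheet
     (rule contents assumed, not taken from the actual file) */
  .italic { font-style: italic; }
  .bold   { font-weight: bold; }
</style>

<!-- SoL HTML file: standard HTML tags -->
<p>She said it was <i>very</i> <b>important</b>.</p>

<!-- e-pub HTML file: the same text using span commands,
     so every text style is found in the main style sheet -->
<p>She said it was <span class="italic">very</span> <span class="bold">important</span>.</p>
```

Both paragraphs display the same way; only the second keeps all of its styling in the main style sheet.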
...................
I'm passing this information along to those who feel they may find it to be of use in creating their own e-pubs.
The lesson to take away from this is to have every type of text format set out in the main style sheet.