Discussion:
[Synce-devel] pwi2html bugs
A Anopheles
2006-09-21 16:28:18 UTC
Hi guys

Don't know if this is the correct forum for this
(I tracked you down via David's name, claiming ownership of the binarypwiparser.cpp file below)
but here goes:

I don't see much activity surrounding pwi2html except a lot of spam messages on the home page; however today I really needed to recover some Pocket Word files (*.PWI) and discovered:
(a) pwi2html ... :-) !!!
(b) it doesn't work ... :-( !
on my Ubuntu Breezy system, on PWI files created on an iPAQ running Windows Mobile 2005

There are 2 problems.
(1) it always hangs, burning 100% CPU. Output file not created.

This happens in binarypwiparser.cpp because, while scanning paragraphs, the start_offset unexpectedly becomes negative. I suppose this might be due to a change in the PWI contents on Mobile 2005?

After adding a fix for this, i.e. breaking out of the loop when the offset < 0,
I can successfully do PWI -> HTML conversion. BUT another problem:
(2) files created in the default output 'openoffice' format sxw cannot be loaded into openoffice2.

The conversion to HTML is ok, which I can live with.

Has anyone else experienced this?
Can anyone suggest what might be wrong with the OpenOffice conversion?




-------------------------------
(temporary) fix for binarypwiparser.cpp
This is a larger patch than you might expect because it also fixes lots of broken code that is only invoked if you #define VERBOSE 1
(as you might do if debugging ... :-) )
The (trivially) interesting stuff is near line 138

------------------------------
--- binarypwiparser.cpp.broken 2006-09-21 16:52:35.000000000 +0100
+++ binarypwiparser.cpp 2006-09-21 16:52:49.000000000 +0100
@@ -17,6 +17,8 @@
* Free Software Foundation, Inc., *
* 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. *
***************************************************************************/
+//#define VERBOSE 1
+
#include "binarypwiparser.h"
#include "paragraph.h"
#include <iostream>
@@ -68,7 +72,7 @@
this->inputStream->seekg(0xa, std::ios_base::cur);

#if VERBOSE
- std::cerr << "Font table starts at offset 0x" << hex << input.tellg() << std::endl;
+ std::cerr << "Font table starts at offset 0x" << hex << inputStream->tellg() << std::endl;
#endif

// Skip font table
@@ -76,7 +80,7 @@

#if VERBOSE

- cerr << "Font table ends at offset 0x" << hex << input.tellg() << endl;
+ cerr << "Font table ends at offset 0x" << hex << inputStream->tellg() << endl;
#endif
// Skip to paragraph count 1
this->inputStream->seekg(0xba, std::ios_base::cur);
@@ -101,7 +105,7 @@

this->inputStream->seekg(0x6, std::ios_base::cur);

- //cerr << "Paragraph index starts at offset 0x" << hex << input.tellg() << endl;
+ //cerr << "Paragraph index starts at offset 0x" << hex << inputStream->tellg() << endl;

#if 0

@@ -120,7 +124,7 @@
this->inputStream->seekg(paragraph_count * PARAGRAPH_ENTRY_SIZE, std::ios_base::cur);
#endif

- //cerr << "Paragraph index ends at offset 0x" << hex << input.tellg() << endl;
+ //cerr << "Paragraph index ends at offset 0x" << hex << inputStream->tellg() << endl;

this->inputStream->seekg(2, std::ios_base::cur);

@@ -134,6 +138,10 @@
#if VERBOSE
std::cerr << "Paragraph " << std::dec << i << " starts at offset 0x" << std::hex << this->inputStream->tellg() << std::endl;
#endif
+ if (this->inputStream->tellg() < 0) {
+ std::cerr << "WARNING: abandoned paragraphs when start_offset became -ve" << std::endl;
+ break;
+ }

string crtString;
unsigned code = read16();
@@ -178,7 +186,7 @@
// this->inputStream->seekg(4 - ((this->inputStream->tellg() - start_offset) & 3), std::ios_base::cur); // FIXME: what is this ?

#if VERBOSE
- cerr << "Paragraph " << dec << i << " ends at offset 0x" << hex << input.tellg() << endl;
+ cerr << "Paragraph " << dec << i << " ends at offset 0x" << hex << inputStream->tellg() << endl;
#endif

}
@@ -191,7 +199,7 @@
;

#if VERBOSE
- cerr << "Decoding ends at offset 0x" << hex << input.tellg() << endl;
+ cerr << "Decoding ends at offset 0x" << hex << inputStream->tellg() << endl;
#endif

}
@@ -411,7 +419,7 @@
*/
void BinaryPwiParser::align(std::streampos start_offset) {
//cerr << "Start offset: 0x" << hex << start_offset << endl;
- //cerr << "Current offset: 0x" << hex << input.tellg() << endl;
+ //cerr << "Current offset: 0x" << hex << inputStream->tellg() << endl;
unsigned align = 4 - ((this->inputStream->tellg() - start_offset) & 3);
//cerr << "Align: " << dec << align << endl;
this->inputStream->seekg(align, std::ios_base::cur);
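The guard added near line 138 can be tried in isolation. This is only a minimal sketch, not pwi2html code: readRecords is a hypothetical helper, and it assumes the negative offset comes from a failed stream (tellg() returns -1 once failbit or badbit is set, e.g. after a read or seek went past the end of the file). Whether that is what actually happens with Windows Mobile 2005 files has not been confirmed.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of pwi2html): read up to `expected`
// 2-byte records and return how many were actually read.
std::size_t readRecords(std::istream& in, std::size_t expected) {
    std::size_t done = 0;
    for (std::size_t i = 0; i < expected; ++i) {
        // tellg() yields pos_type(-1) when the stream is in a failed state;
        // this mirrors the "offset < 0" check in the patch.
        if (in.tellg() == std::istream::pos_type(-1)) {
            break;
        }
        char bytes[2];
        if (!in.read(bytes, 2)) {
            break;  // short read: stop instead of looping on a dead stream
        }
        ++done;
    }
    return done;
}
```

Fed a 5-byte stream and asked for 10 records, the helper stops after 2 complete records instead of spinning. Checking the result of each read, rather than only the offset, also catches the failure one step earlier.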
David Eriksson
2006-09-21 17:47:09 UTC
Post by A Anopheles
Hi guys
Don't know if this is the correct forum for this
(I tracked you down via David's name, claiming ownership of the binarypwiparser.cpp file below)
(a) pwi2html ... :-) !!!
(b) it doesn't work ... :-( !
on my ubuntu Breezy system, on PWI files created on Ipaq running Windows Mobile 2005
There are 2 problems.
(1) it always hangs, burning 100% CPU. Output file not created.
This happens in binarypwiparser.cpp because while scanning paragraphs the start_offset unexpectedly becomes -ve. I suppose this might be due to a change in the PWI contents on Mobile 2005?
After adding a fix for this i.e. break out of the loop when the offset < 0,
(2) files created in the default output 'openoffice' format sxw cannot be loaded into openoffice2.
The conversion to HTML is ok, which I can live with.
Has anyone else experienced this?
Can anyone suggest what might be wrong with the OpenOffice conversion?
Hi,

All attempts at PWI support I know of (including those of my own making)
simply suck for anything but the simplest case.

I actually suggest that you either hire a skillful reverse engineer for
a few months... or use a Windows PC for your needs.

\David
A Anopheles
2006-09-22 06:47:24 UTC
Post by David Eriksson
All attempts at PWI support I know of (including those of my own making)
simply suck for anything but the simplest case.
I actually suggest that you either hire a skillful reverse engineer for
a few months... or use a Windows PC for your needs.
I'm not sure a Windows PC is going to help. My iPAQ has died, leaving me with some PWI files on a CompactFlash card. Can I run an app on Windows to decode them in the absence of the iPAQ? I'm sure I read somewhere that MS Word can't open them.

Fortunately, the 'simplest case' is all I require: my PWI files all contain unformatted simple text & I don't care about formatting or fonts, so the [patched] pwi2html might well suffice.

Thanks anyway.