[Synce-devel] trayicon segfaults in odccm-client.c
Iain Buchanan
2007-09-27 01:28:06 UTC
Hi all!

I load trayicon and then plug in my device. It then segfaults at line
630 of odccm-client.c:

/* get rapi connection */
if (!(odccm_device_get_rapi_connection(self, device))) {
g_object_unref(device); // segfaults here!!
goto exit;
}

I tried wrapping g_object_unref like this:
if (device) g_object_unref(device);
but it still failed!
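
A note on why the `if (device)` guard may not help: a NULL check only catches a pointer that was never set or was explicitly cleared. It cannot tell a live object from one that has already been released, or from one whose internals are corrupt. The sketch below uses a hypothetical `RefObj` type (a stand-in for illustration, not GObject and not the trayicon source) to show the clear-on-release pattern, where the release function also resets the caller's pointer so that a second release becomes a no-op. Newer GLib versions provide `g_clear_object()` for the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; a stand-in, not GObject. */
typedef struct {
    int ref_count;
} RefObj;

static int live_objects = 0;

/* Allocate an object holding a single reference. */
RefObj *ref_obj_new(void)
{
    RefObj *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->ref_count = 1;
    live_objects++;
    return obj;
}

/* Drop one reference and reset the caller's pointer to NULL,
 * so a later NULL check on that pointer is actually meaningful. */
void ref_obj_clear(RefObj **slot)
{
    RefObj *obj;

    if (slot == NULL || *slot == NULL)
        return;
    obj = *slot;
    *slot = NULL;
    if (--obj->ref_count == 0) {
        free(obj);
        live_objects--;
    }
}

/* Number of objects that are still allocated. */
int ref_obj_live_count(void)
{
    return live_objects;
}
```

Calling `ref_obj_clear(&device)` twice in a row is safe here, whereas a bare unref followed by `if (device) unref(device)` would release the same memory twice.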

It seems to hinge on this message:

** (synce-trayicon:3918): CRITICAL **: get_device_name_via_rapi:
CeRegOpenKeyEx failed getting device name

and then

*** glibc detected *** synce-trayicon: munmap_chunk(): invalid pointer:
0x0810c9e0 ***


attached is the output from synce-trayicon -f. Actually, sometimes this
brings up bug-buddy, sometimes it doesn't, so I don't know what's going
on there, but in either case, the trayicon is unresponsive until I kill
it (then it restarts itself).

And also, once it has restarted, it doesn't recognise my device.

Any suggestions? thanks!
--
Iain Buchanan <iaindb at netspace dot net dot au>

One man's constant is another man's variable.
-- A.J. Perlis
David Eriksson
2007-09-27 20:04:05 UTC
Post by Iain Buchanan
Hi all!
I load trayicon and then plug in my device. It then segfaults at line
/* get rapi connection */
if (!(odccm_device_get_rapi_connection(self, device))) {
g_object_unref(device); // segfaults here!!
goto exit;
}
The latest version of odccm-client.c does not look like that. Please try
the trayicon module from the Subversion repository.

\David


--
Iain Buchanan
2007-09-27 23:59:28 UTC
Post by David Eriksson
Post by Iain Buchanan
Hi all!
I load trayicon and then plug in my device. It then segfaults at line
/* get rapi connection */
if (!(odccm_device_get_rapi_connection(self, device))) {
g_object_unref(device); // segfaults here!!
goto exit;
}
The latest version of odccm-client.c does not look like that. Please try
the trayicon module from the Subversion repository.
is this the right repository?

https://synce.svn.sourceforge.net/svnroot/synce/trunk/trayicon

because it says I'm:

At revision 3030.

and the source still has the above at lines 628 - 632. Am I doing
something wrong?

thanks,
--
Iain Buchanan <iaindb at netspace dot net dot au>

Airplanes are interesting toys but of no military value.
-- Marechal Ferdinand Foch, Professor of Strategy,
Ecole Superieure de Guerre
David Eriksson
2007-09-28 04:37:23 UTC
Post by Iain Buchanan
Post by David Eriksson
Post by Iain Buchanan
Hi all!
I load trayicon and then plug in my device. It then segfaults at line
/* get rapi connection */
if (!(odccm_device_get_rapi_connection(self, device))) {
g_object_unref(device); // segfaults here!!
goto exit;
}
The latest version of odccm-client.c does not look like that. Please try
the trayicon module from the Subversion repository.
is this the right repository?
https://synce.svn.sourceforge.net/svnroot/synce/trunk/trayicon
At revision 3030.
and the source still has the above at lines 628 - 632. Am I doing
something wrong?
Sorry, I was looking at line 442.

\David
--
Iain Buchanan
2007-09-28 02:59:30 UTC
Post by Iain Buchanan
Hi all!
I load trayicon and then plug in my device. It then segfaults at line
[snip]
Does it always do this, or is it intermittent ?
It always segfaults, yes.
Unfortunately I don't
have time at the moment to do much, but I'll give you some ideas in case
you can try yourself.
I _can_ try myself, but that doesn't mean I'll get anywhere :)

I'm having a hard time getting the line number from gdb though... it
may in fact be from line 451, since odccm_device_get_rapi_connection is
called from two places. This is line 451:

goto exit;
error_exit:
g_object_unref(new_proxy);
if (device) g_object_unref(device); // here!
exit:
return;
}

It seems that get_device_name_via_rapi fails with the message
"CeRegOpenKeyEx failed getting device name", and so returns NULL. And
because that fails, odccm_device_get_rapi_connection returns FALSE, and
therefore odccm_device_connected_cb tries to g_object_unref(device).
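
The chain described above can be sketched in plain C. Every name here (`Device`, `lookup_name`, `get_connection`, `on_connected`) is a hypothetical stand-in and not the real trayicon code; the point is only the shape: a failed lookup returns NULL, the connection step reports failure, and the callback releases the device exactly once on its way out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the real trayicon functions. */
typedef struct {
    char *name; /* owned heap copy, or NULL */
} Device;

/* Stand-in for the name lookup: NULL means the registry read failed. */
static char *lookup_name(bool registry_ok)
{
    char *copy;

    if (!registry_ok)
        return NULL;
    copy = malloc(sizeof "IO2");
    if (copy != NULL)
        memcpy(copy, "IO2", sizeof "IO2");
    return copy;
}

/* Stand-in for the connection step: fails when no name is available. */
static bool get_connection(Device *dev, bool registry_ok)
{
    dev->name = lookup_name(registry_ok);
    return dev->name != NULL;
}

/* Stand-in for the connect callback. Returns true if setup succeeded. */
bool on_connected(bool registry_ok)
{
    bool ok = false;
    Device *dev = calloc(1, sizeof *dev); /* every field starts as NULL */

    if (dev == NULL)
        goto exit;
    if (!get_connection(dev, registry_ok))
        goto error_exit;
    ok = true;
    /* A real callback would store dev here; this sketch has nowhere
     * to keep it, so it falls through and releases it as well. */
error_exit:
    free(dev->name); /* free(NULL) is a no-op */
    free(dev);
exit:
    return ok;
}
```

Because `calloc` zeroes the struct, the error path can free every field unconditionally, even when the lookup never ran to completion.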

If I comment out this line "if (device) g_object_unref(device);" then
trayicon doesn't segfault, but it still doesn't detect the device when I
plug it in (makes sense). However, it _does_ detect the name when I
unplug it. Attached is the output with the unref line commented, with
some line breaks just before I plugged the device in, and just before I
unplugged it.

That's about as far as I got...
I suspect there are two different problems, one that causes the name
lookup failure below, and another that causes a segfault when unreffing
the device object.
I think it's because it can't get the name that it causes this chain of
events.
Taking the unref action, my best guess is that when the object is
finalized in device.c, it tries to free an invalid pointer.
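
To illustrate that guess (and only as an illustration, since the real device.c is not shown in this thread), here is a minimal sketch of a finalizer that stays safe when setup was only partly completed. The `DeviceInfo` type and its field names are hypothetical. The rule it follows is that every pointer field is either NULL or an owned heap copy, so the finalizer never hands `free()` an uninitialised or borrowed pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device record; the field names are illustrative only. */
typedef struct {
    char *name;
    char *platform;
} DeviceInfo;

/* Start with every pointer field set to NULL. */
void device_info_init(DeviceInfo *info)
{
    info->name = NULL;
    info->platform = NULL;
}

/* Store an owned heap copy of the name, never the caller's pointer. */
bool device_info_set_name(DeviceInfo *info, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return false;
    memcpy(copy, name, len);
    free(info->name); /* drop any previous copy */
    info->name = copy;
    return true;
}

/* Safe on a half-initialised record: free(NULL) does nothing. */
void device_info_finalize(DeviceInfo *info)
{
    free(info->name);
    free(info->platform);
    info->name = NULL;
    info->platform = NULL;
}
```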
Not enough information to speculate on the name lookup failure. What
dccm are you using ? Any useful output from it ?
odccm. It shows this when I plug in my device:

** (odccm:30213): DEBUG: device_info_received
** (odccm:30213): DEBUG: 82 36 34 2c 45 c6 87 98 67 5f 40 1f 60 86 6d d0
05 00 00 00 01 00 00 00 03 00 00 00 49 00 4f 00 32 00 00 00 05 01 c3 00
11 0a 00 00 05 00 00 00 7e f7 98 00 5c 38 ce 60 0f 00 00 00 50 6f 63 6b
65 74 50 43 00 53 53 44 4b 00 00 08 00 00 00 58 64 61 20 41 74 6f 6d 00
02 00 00 00 04 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00
** (odccm:30213): DEBUG: device_info_received: registering object path
'/org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_'

the hex translates to this:

[unprintable bytes] IO2 [unprintable bytes] PocketPCSSDXda Atom

IO2 is my device's name, and it is an Xda. So odccm is seeing the name.
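
For reference, the name can be picked out of the dump by hand. In the bytes above, `03 00 00 00` is immediately followed by `49 00 4f 00 32 00`, which reads as a 32-bit little-endian count of 3 and then three UTF-16LE characters. That layout is inferred from this one dump only, not from any protocol documentation, and the function below is a hypothetical helper rather than anything in odccm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian count from buf[0..3], then that many
 * UTF-16LE code units. Code units outside ASCII are written as '?'.
 * Returns the number of characters written, or -1 if the input is
 * too short or the output buffer is too small. */
int decode_utf16le_field(const uint8_t *buf, size_t buf_len,
                         char *out, size_t out_len)
{
    uint32_t count, i;

    if (buf_len < 4)
        return -1;
    count = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
            ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    if (count >= out_len || count > (buf_len - 4) / 2)
        return -1;
    for (i = 0; i < count; i++) {
        uint8_t lo = buf[4 + 2 * i];
        uint8_t hi = buf[5 + 2 * i];
        out[i] = (hi == 0 && lo < 0x80) ? (char)lo : '?';
    }
    out[count] = '\0';
    return (int)count;
}

/* Bytes copied from the odccm debug output, starting at "03 00 00 00". */
const uint8_t name_field[] = {
    0x03, 0x00, 0x00, 0x00,             /* count: 3 characters */
    0x49, 0x00, 0x4f, 0x00, 0x32, 0x00, /* 'I' 'O' '2' in UTF-16LE */
    0x00, 0x00                          /* terminator */
};
```

Passing `name_field` to `decode_utf16le_field` yields the string `IO2`.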

trayicon gets the same device info string:

** (synce-trayicon:26369): DEBUG: odccm_device_connected_cb: Received
connect from
odccm: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_

[snip]
Hmm, I might make the -f option not connect to the session manager,
would stop unexpected restarts when debugging.
Post by Iain Buchanan
And also, once it has restarted, it doesn't recognise my device.
Any suggestions? thanks!
It won't recognise an already connected device on startup, it's on my
(growing) list :)
Try the attached patch to main.c for not restarting when the -f option
is specified. There are lots of ways it could have been done, this is
just one :)

What else on your todo list is this easy?!!

Well, sorry about the huge email, but thanks for any help you can
continue to provide!
--
Iain Buchanan <iaindb at netspace dot net dot au>

"Just Say No." - Nancy Reagan

"No." - Ronald Reagan
Mark Ellis
2007-09-28 06:21:01 UTC
Post by Iain Buchanan
Post by Iain Buchanan
Hi all!
I load trayicon and then plug in my device. It then segfaults at line
[snip]
Does it always do this, or is it intermittent ?
It always segfaults, yes.
Unfortunately I don't
have time at the moment to do much, but I'll give you some ideas in case
you can try yourself.
I _can_ try myself, but that doesn't mean I'll get anywhere :)
I'm having a hard time getting the line number from gdb though... it
may in fact be from line 451, since odccm_device_get_rapi_connection is
goto exit;
g_object_unref(new_proxy);
if (device) g_object_unref(device); // here!
return;
}
Ok, just put some g_debug or printf statements before and after those
two calls to odccm_device_get_rapi_connection to identify where we are
going wrong.
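
A minimal sketch of that kind of tracing, in plain C so it does not depend on GLib: the `format_trace` helper and the `TRACE` macro are hypothetical names. The one detail that matters when chasing a crash is to flush right after each message, so the last line before the abort is not left sitting in a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "function:line: message" into buf. Returns what snprintf
 * returns: the length the full message needs, excluding the NUL. */
int format_trace(char *buf, size_t size, const char *func, int line,
                 const char *msg)
{
    return snprintf(buf, size, "%s:%d: %s", func, line, msg);
}

/* Print a trace line to stderr and flush it immediately. */
#define TRACE(msg)                                               \
    do {                                                         \
        char trace_buf_[256];                                    \
        format_trace(trace_buf_, sizeof trace_buf_, __func__,    \
                     __LINE__, (msg));                           \
        fprintf(stderr, "%s\n", trace_buf_);                     \
        fflush(stderr);                                          \
    } while (0)
```

Placing `TRACE("before unref");` and `TRACE("after unref");` around each suspect call shows which one never returns.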
Post by Iain Buchanan
It seems that get_device_name_via_rapi fails with the message
"CeRegOpenKeyEx failed getting device name", and so returns NULL. And
because that fails, odccm_device_get_rapi_connection returns FALSE, and
therefore odccm_device_connected_cb tries to g_object_unref(device).
If I comment out this line "if (device) g_object_unref(device);" then
trayicon doesn't segfault, but it still doesn't detect the device when I
plug it in (makes sense). However, it _does_ detect the name when I
unplug it. Attached is the output with the unref line commented, with
some line breaks just before I plugged the device in, and just before I
unplugged it.
That's about as far as I got...
I suspect there are two different problems, one that causes the name
lookup failure below, and another that causes a segfault when unreffing
the device object.
I think it's because it can't get the name that it causes this chain of
events.
Indeed, the cause is lack of name, which is in fact lack of connection
I'd guess.
Post by Iain Buchanan
Taking the unref action, my best guess is that when the object is
finalized in device.c, it tries to free an invalid pointer.
Not enough information to speculate on the name lookup failure. What
dccm are you using ? Any useful output from it ?
** (odccm:30213): DEBUG: device_info_received
** (odccm:30213): DEBUG: 82 36 34 2c 45 c6 87 98 67 5f 40 1f 60 86 6d d0
05 00 00 00 01 00 00 00 03 00 00 00 49 00 4f 00 32 00 00 00 05 01 c3 00
11 0a 00 00 05 00 00 00 7e f7 98 00 5c 38 ce 60 0f 00 00 00 50 6f 63 6b
65 74 50 43 00 53 53 44 4b 00 00 08 00 00 00 58 64 61 20 41 74 6f 6d 00
02 00 00 00 04 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00
** (odccm:30213): DEBUG: device_info_received: registering object path
'/org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_'
[unprintable bytes] PocketPCSSDXda Atom
IO2 is my device's name, and it is an Xda. So odccm is seeing the name.
** (synce-trayicon:26369): DEBUG: odccm_device_connected_cb: Received
connect from
odccm: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_
[snip]
Hmm, I might make the -f option not connect to the session manager,
would stop unexpected restarts when debugging.
Post by Iain Buchanan
And also, once it has restarted, it doesn't recognise my device.
Any suggestions? thanks!
It won't recognise an already connected device on startup, it's on my
(growing) list :)
Try the attached patch to main.c for not restarting when the -f option
is specified. There are lots of ways it could have been done, this is
just one :)
What else on your todo list is this easy?!!
Cool, thanks ! Unfortunately the list is in my head, must write it
down.....
Post by Iain Buchanan
Well, sorry about the huge email, but thanks for any help you can
continue to provide!
This is all good !

Mark

PS Is it just me or did it suddenly get very busy around here ?
Iain Buchanan
2007-10-01 07:28:06 UTC
Post by Mark Ellis
Post by Iain Buchanan
I'm having a hard time getting the line number from gdb though... it
may in fact be from line 451, since odccm_device_get_rapi_connection is
goto exit;
g_object_unref(new_proxy);
if (device) g_object_unref(device); // here!
return;
}
Ok, just put some g_debug or printf statements before and after those
two calls to odccm_device_get_rapi_connection to identify where we are
going wrong.
yeah did that, but I couldn't print anything useful. All I discovered
was that it really is that line! I see a message before the unref, then the
segfault, and I don't see the message after.

more hints?
Post by Mark Ellis
PS Is it just me or did it suddenly get very busy around here ?
umm, didn't notice lots more noise, apart from my own emails :) But
busy is good IMHO, means more uptake of synce :)

thanks,
--
Iain Buchanan <iaindb at netspace dot net dot au>

The party adjourned to a hot tub, yes. Fully clothed, I might add.
-- IBM employee, testifying in California State Supreme Court
Mark Ellis
2007-10-03 06:13:36 UTC
Post by Iain Buchanan
Post by Mark Ellis
Post by Iain Buchanan
I'm having a hard time getting the line number from gdb though... it
may in fact be from line 451, since odccm_device_get_rapi_connection is
goto exit;
g_object_unref(new_proxy);
if (device) g_object_unref(device); // here!
return;
}
Ok, just put some g_debug or printf statements before and after those
two calls to odccm_device_get_rapi_connection to identify where we are
going wrong.
Do you have a password on your device ? I'm still thinking, hopefully
send you a little debug patch soon.
Post by Iain Buchanan
yeah did that, but I couldn't print anything useful. All I discovered
was that it really is that line! I see a message before the unref, then the
segfault, and I don't see the message after.
more hints?
Post by Mark Ellis
PS Is it just me or did it suddenly get very busy around here ?
umm, didn't notice lots more noise, apart from my own emails :) But
busy is good IMHO, means more uptake of synce :)
Definitely !

Mark
Mark Ellis
2007-10-03 21:01:44 UTC
Post by Mark Ellis
Do you have a password on your device ? I'm still thinking, hopefully
send you a little debug patch soon.
Try the attached patch, hopefully it will prevent the segfault. If that
works ok we can start to figure out why the connection is a bit flaky.

Mark
Iain Buchanan
2007-10-04 05:20:01 UTC
Post by Mark Ellis
Post by Mark Ellis
Do you have a password on your device ? I'm still thinking, hopefully
send you a little debug patch soon.
nope, no password. Used to, but I took it off to debug some other synce
things :)
Post by Mark Ellis
Try the attached patch, hopefully it will prevent the segfault. If that
works ok we can start to figure out why the connection is a bit flaky.
heh... still crashes:

** (synce-trayicon:18918): DEBUG: odccm_device_connected_cb: Received connect from odccm: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_

** (synce-trayicon:18918): CRITICAL **: get_device_name_via_rapi: CeRegOpenKeyEx failed getting device name
*** glibc detected *** synce-trayicon: double free or corruption (out): 0x08104ed0 ***

I've attached the backtrace.

I put in a few sleeps just in case it was reacting to some message from
odccm before odccm actually had the name, or something... just stabbing
in the dark but it didn't work!

It's coming up to the weekend - I could probably nut out some time on
irc late one night if you want to line up our timezones. Might be a bit
more efficient than one message per day! Up to you.

thanks for the help,
--
Iain Buchanan <iaindb at netspace dot net dot au>

<aether> sleep is for the weak
<plasmaroo> aether++
<tseng> aether--
<aether> My +1ness was short, but well worth it
Mark Ellis
2007-10-04 06:17:22 UTC
Post by Iain Buchanan
Post by Mark Ellis
Try the attached patch, hopefully it will prevent the segfault. If that
works ok we can start to figure out why the connection is a bit flaky.
I've attached the backtrace.
That backtrace could be remarkably useful, try the new patch attached.
Post by Iain Buchanan
It's coming up to the weekend - I could probably nut out some time on
irc late one night if you want to line up our timezones. Might be a bit
more efficient than one message per day! Up to you.
thanks for the help,
Unfortunately I'm on highly unreliable dial-up at weekends (women just don't
understand sometimes!).

No worries, hope this works.

Mark
Iain Buchanan
2007-10-04 07:07:06 UTC
Post by Mark Ellis
Post by Iain Buchanan
Post by Mark Ellis
Try the attached patch, hopefully it will prevent the segfault. If that
works ok we can start to figure out why the connection is a bit flaky.
I've attached the backtrace.
That backtrace could be remarkably useful, try the new patch attached.
Don't have time to analyse this one (not so good at it anyway) as I'm
about to go...

here is the output, backtrace attached.

** (synce-trayicon:26141): DEBUG: odccm_device_connected_cb: Received
connect from
odccm: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_

** (synce-trayicon:26141): CRITICAL **: get_device_name_via_rapi:
CeRegOpenKeyEx failed getting device name
*** glibc detected *** synce-trayicon: free(): invalid pointer:
0x080e9b78 ***

may not be able to get on to it again till tomorrow - have to ride home
now :)

thanks & cya,
--
Iain Buchanan <iaindb at netspace dot net dot au>

Caution: breathing may be hazardous to your health.
Mark Ellis
2007-10-05 06:19:35 UTC
Post by Iain Buchanan
Post by Mark Ellis
Post by Iain Buchanan
Post by Mark Ellis
Try the attached patch, hopefully it will prevent the segfault. If that
works ok we can start to figure out why the connection is a bit flaky.
I've attached the backtrace.
That backtrace could be remarkably useful, try the new patch attached.
Don't have time to analyse this one (not so good at it anyway) as I'm
about to go...
here is the output, backtrace attached.
** (synce-trayicon:26141): DEBUG: odccm_device_connected_cb: Received
connect from
odccm: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_
CeRegOpenKeyEx failed getting device name
0x080e9b78 ***
may not be able to get on to it again till tomorrow - have to ride home
now :)
thanks & cya,
Try this one.

Mark
Iain Buchanan
2007-10-09 01:41:14 UTC
Post by Mark Ellis
Try this one.
woohoo, no crash! But of course the device still isn't recognised:

$ synce-trayicon -f
** (synce-trayicon:23083): DEBUG: Running in foreground
** (synce-trayicon:23083): DEBUG: module_load_all: loading
module /usr/lib/synce-trayicon/modules/gnomevfs-trayicon-module.so
** (synce-trayicon:23083): DEBUG: module_load_all: loading
module /usr/lib/synce-trayicon/modules/test-mod.so
** (synce-trayicon:23083): DEBUG: g_module_check_init: running from
trayicon test module for /usr/lib/synce-trayicon/modules/test-mod.so

(and when I plug in the device)

** (synce-trayicon:23083): DEBUG: odccm_device_connected_cb: Received
connect from
odccm: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_
** (synce-trayicon:23083): CRITICAL **: get_device_name_via_rapi:
CeRegOpenKeyEx failed getting device name

(and when I unplug the device)

** (synce-trayicon:23083): WARNING **: odccm_device_disconnected_cb:
Received disconnect from odccm from unfound
device: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_

thanks,
--
Iain Buchanan <iaindb at netspace dot net dot au>

BOFH Excuse #260:

We're upgrading /dev/null
Mark Ellis
2007-10-09 08:49:58 UTC
Post by Iain Buchanan
Post by Mark Ellis
Try this one.
Cool, that's the trayicon bug fixed.
Post by Iain Buchanan
$ synce-trayicon -f
** (synce-trayicon:23083): DEBUG: Running in foreground
** (synce-trayicon:23083): DEBUG: module_load_all: loading
module /usr/lib/synce-trayicon/modules/gnomevfs-trayicon-module.so
** (synce-trayicon:23083): DEBUG: module_load_all: loading
module /usr/lib/synce-trayicon/modules/test-mod.so
** (synce-trayicon:23083): DEBUG: g_module_check_init: running from
trayicon test module for /usr/lib/synce-trayicon/modules/test-mod.so
(and when I plug in the device)
** (synce-trayicon:23083): DEBUG: odccm_device_connected_cb: Received
connect from
odccm: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_
CeRegOpenKeyEx failed getting device name
(and when I unplug the device)
Received disconnect from odccm from unfound
device: /org/synce/odccm/Device/_2C343682_C645_9887_675F_401F60866DD0_
thanks,
Do the command line tools work ? What version of Windows Mobile is it ?

Mark
Iain Buchanan
2007-10-10 00:54:00 UTC
Post by Mark Ellis
Do the command line tools work ?
yes - pls, pstatus etc. and gnome vfs all work.
Post by Mark Ellis
What version of Windows Mobile is it ?
WM5, with some service pack or upgrade. Although pstatus says "Windows
CE" - don't know if this is the same as WM5.

$ pstatus
Version
=======
Version: 5.1.195 (Unknown)
Platform: 3 (Windows CE)
Details: ""

System
======
Processor architecture: 5 (ARM)
Processor type: 2577 (StrongARM)
Page size: 0x10000

Note that pls doesn't work right after eth1 comes up. It says:

** (process:5451): WARNING **: No devices connected to odccm
pls: Could not find configuration at path '(Default)'

but then it works again a few seconds later. Could be a red herring :)

synce-trayicon _used_ to work - don't know what version though :) It
could have even been before you started working on it...

thanks,
--
Iain Buchanan <iaindb at netspace dot net dot au>

"Consider a spherical bear, in simple harmonic motion..."
-- Professor in the UCB physics department
Mark Ellis
2007-10-12 06:11:42 UTC
Post by Iain Buchanan
Post by Mark Ellis
Do the command line tools work ?
yes - pls, pstatus etc. and gnome vfs all work.
Post by Mark Ellis
What version of Windows Mobile is it ?
WM5, with some service pack or upgrade. Although pstatus says "Windows
CE" - don't know if this is the same as WM5.
$ pstatus
Version
=======
Version: 5.1.195 (Unknown)
Platform: 3 (Windows CE)
Details: ""
System
======
Processor architecture: 5 (ARM)
Processor type: 2577 (StrongARM)
Page size: 0x10000
That's a shame, if it was pre-WM5 I had an idea :)
Post by Iain Buchanan
** (process:5451): WARNING **: No devices connected to odccm
pls: Could not find configuration at path '(Default)'
but then it works again a few seconds later. Could be a red herring :)
Probably not going to help. Odccm not advertising devices implies it's
still negotiating the connection. Still might be worth trying something,
more thinking needed.

Mark
Mark Ellis
2007-10-17 06:07:03 UTC
Post by Mark Ellis
Post by Iain Buchanan
** (process:5451): WARNING **: No devices connected to odccm
pls: Could not find configuration at path '(Default)'
but then it works again a few seconds later. Could be a red herring :)
Probably not going to help. Odccm not advertising devices implies it's
still negotiating the connection. Still might be worth trying something,
more thinking needed.
But on the other hand it doesn't hurt to try :)

Attached patch just sleeps for 10secs between getting the connection
message and trying a rapi connection.
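
For comparison, and not what the attached patch does, a bounded retry is one alternative to a single fixed sleep: try the connection, wait a little, and try again up to a limit. The sketch below is plain C with hypothetical names (`connect_with_retry`, `fake_connect`) and uses POSIX `sleep()`. It still blocks the caller while it waits, so in a main-loop application a timer callback such as GLib's `g_timeout_add()` would probably be the better fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Callback type for one connection attempt; ctx is caller-defined. */
typedef bool (*connect_fn)(void *ctx);

/* Try up to max_attempts times, sleeping delay_seconds between tries.
 * Returns the 1-based number of the attempt that succeeded, or 0 if
 * every attempt failed. */
int connect_with_retry(connect_fn try_connect, void *ctx,
                       int max_attempts, unsigned delay_seconds)
{
    int attempt;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_connect(ctx))
            return attempt;
        if (attempt < max_attempts)
            sleep(delay_seconds);
    }
    return 0;
}

/* Hypothetical device that becomes ready after some failed attempts;
 * ctx points to an int holding the number of failures still to come. */
bool fake_connect(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return false;
    }
    return true;
}
```

With two failures queued up, `connect_with_retry(fake_connect, &failures, 5, 1)` succeeds on the third attempt after waiting two seconds in total, instead of always waiting the full ten.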

I've also committed what we have so far.

Thanks
Mark
