[OPEN-ILS-DEV] Debugging OpenSRF installation

Dan Wells dbw2 at calvin.edu
Thu Jun 18 13:03:12 EDT 2009


Hello Victoria,

The default Evergreen install has many services running on the public interface, so I think you will be fine.  In fact, that is almost certainly the reason nobody else ran into this bug before.  In the course of doing a standard Evergreen install you will end up replacing the entire opensrf_core.xml.

DW

>>> Victoria Bush <vbush at ilstu.edu> 6/18/2009 12:32 PM >>>
You know, it would help if I wrote coherently.

I mean, should I continue the Evergreen installation? Everything  
works, so I'm assuming yes. But I'm wondering if the single-service  
bug here is going to cause problems.

-Vicki


On Jun 18, 2009, at 11:06 AM, Victoria Bush wrote:

> BINGO! That worked! I tried it both by starting each individual  
> service and then by stopping everything and doing osrf_ctl.sh -l -a  
> start_all. All good.
>
> The question then becomes: shall I go ahead and continue with the
> Evergreen server in my case? Are there situations where only one
> service is loaded to an interface at a time?
>
> -Vicki
>
>
>
> On Jun 18, 2009, at 10:26 AM, Dan Wells wrote:
>
>> Hello all,
>>
>> Well, I went back to square one and was able to reproduce this  
>> buggy behavior, so Victoria is not alone!
>>
>> Furthermore, I think I have a lead for the developers to follow in  
>> fixing this.  It seems that the following default config in  
>> opensrf_core.xml is not being parsed as valid:
>>
>> <opensrf>
>>   <routers>
>>     <router>
>>       <name>router</name>
>>       <domain>public.localhost</domain>
>>       <services>
>>           <service>opensrf.math</service>
>>       </services>
>> ...
>>
>> Since the math service was working fine on my full install, I  
>> noticed that the only difference in this section was that many more  
>> services were attached to the public.localhost router.  As a basic  
>> test, I added another fake service line, as follows:
>>
>> ...
>>       <services>
>>           <service>opensrf.math</service>
>>           <service>opensrf.blah</service>
>>       </services>
>> ...
>>
>> Bingo!  opensrf.math now tested fine when logging into  
>> public.localhost.  I also tested a case of simply doubling the  
>> opensrf.math line:
>>
>> ...
>>       <services>
>>           <service>opensrf.math</service>
>>           <service>opensrf.math</service>
>>       </services>
>> ...
>>
>> and that worked as well.
>>
>> So, it seems that there is a bug in parsing opensrf_core.xml when  
>> only a single service is listed for a router.  That service does  
>> not get attached properly.  Adding another service allows the first  
>> service to work, but it is unknown if the second service is  
>> affected (that is, it may be the case that the last service listed  
>> is failing, though I somehow doubt that).
>>
>> Victoria, try doubling the service line as I have, restart the  
>> stack, and see if the public math interface works for you.
>>
>> Good luck,
>> DW
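
A minimal sketch for spotting the condition described above: it scans a config for bootstrap routers that list exactly one <service>. It assumes the layout shown in this thread (<config>/<opensrf>/<routers>/<router>), and the function name and SAMPLE text are illustrative only. It does not reproduce OpenSRF's own parser or explain why that parser fails.

```python
import xml.etree.ElementTree as ET


def routers_with_single_service(xml_text):
    """Return the <domain> of each bootstrap router listing exactly one <service>."""
    root = ET.fromstring(xml_text)
    flagged = []
    # Only the routers under <opensrf><routers> carry a <services> list.
    for router in root.findall("./opensrf/routers/router"):
        services = router.findall("./services/service")
        if len(services) == 1:
            flagged.append(router.findtext("domain"))
    return flagged


# Trimmed-down copy of the default layout quoted in this thread (assumption:
# everything not needed for the check has been left out).
SAMPLE = """<?xml version="1.0"?>
<config>
  <opensrf>
    <routers>
      <router>
        <name>router</name>
        <domain>public.localhost</domain>
        <services>
          <service>opensrf.math</service>
        </services>
      </router>
      <router>
        <name>router</name>
        <domain>private.localhost</domain>
      </router>
    </routers>
  </opensrf>
</config>
"""

if __name__ == "__main__":
    print(routers_with_single_service(SAMPLE))
```

A router flagged by this check is a candidate for the workaround of doubling the service line. A router with no <services> section at all, like the private one, is not flagged.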
>>
>>
>>>>> Victoria Bush <vbush at ilstu.edu> 6/16/2009 10:59 AM >>>
>>
>> On Jun 15, 2009, at 5:06 PM, Dan Wells wrote:
>>
>>> Hello Victoria,
>>>
>>> I kinda hate to suggest this, since I thought this issue was fixed,
>>> but have you tried starting the components separately?  That is:
>>>
>>> osrf_ctl.sh -l -a stop_all
>>>
>>> (wait a few minutes, kill as needed)
>>>
>>> osrf_ctl.sh -l -a start_router
>>>
>>> (wait for activity to stop/slow)
>>>
>>> osrf_ctl.sh -l -a start_perl
>>>
>>> (wait for activity to stop/slow)
>>>
>>> osrf_ctl.sh -l -a start_c
>>>
>>> Since many pieces in this system operate independently, starting
>>> them in a more controlled fashion has been suggested in the past for
>>> quirky race-condition-type problems.
>>>
>>> Good luck,
>>> DW
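
The staged start suggested above could be scripted roughly as follows. This is only a sketch: the four actions and the -l -a flags are taken from the message, while the fixed pause between steps is an assumption standing in for "wait for activity to stop/slow", and the function names are hypothetical.

```python
import subprocess
import time

# Actions in the order suggested in the message above.
STAGED_ACTIONS = ["stop_all", "start_router", "start_perl", "start_c"]


def staged_commands(script="osrf_ctl.sh"):
    """Build one osrf_ctl.sh invocation per action, in order."""
    return [[script, "-l", "-a", action] for action in STAGED_ACTIONS]


def run_staged(pause_seconds=30, runner=subprocess.run, sleeper=time.sleep):
    """Run each step, pausing afterwards so the stack can settle.

    pause_seconds is an assumed stand-in for watching activity by hand;
    runner and sleeper are injectable so the sequence can be dry-run.
    """
    for cmd in staged_commands():
        runner(cmd, check=True)
        sleeper(pause_seconds)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in staged_commands():
        print(" ".join(cmd))
```

Calling run_staged() as the opensrf user would execute the steps for real. Leftover processes after stop_all would still need to be killed by hand, as the message notes.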
>>>
>>>
>>
>> Okay, several things are causing confusion on my part because the
>> documented behavior at
>> http://evergreen-ils.org/dokuwiki/doku.php?id=troubleshooting:checking_for_errors 
>>
>> is not what I'm seeing. First of all, the default opensrf_core.xml
>> example file that I used to create my file did not create separate
>> log files, private.router.log and public.router.log. (Of course, on
>> the troubleshooting page, two paragraphs above this mention of two
>> router log files, it mentions the single router.log that is instead
>> created.) So I changed my xml file to create two separate log files.
>>
>> In addition, the opensrf_core.xml example file claims that a log file
>> gateway.log will be created, but I've never seen one. What seems to
>> be happening is that the private.localhost router comes up fine, but
>> the public.localhost one doesn't--or if it does come up, it's in some
>> weird state that doesn't do anything.
>>
>> Retracing my steps:
>>
>> 1. I stopped everything and killed all leftover processes. I also
>> moved all the log files into a subdirectory to hide them for now.
>> 2. I *only* started the router:
>> 	osrf_ctl.sh -l -a start_router
>> 3. The *only* log file that is created is private.router.log:
>>
>>> router 2009-06-16 09:25:26 [INFO:30364:osrf_router_main.c:95:]
>>> Router connecting as: server: private.localhost port: 5222 user:
>>> router resource: router
>>> router 2009-06-16 09:25:26 [INFO:30364:osrf_router_main.c:117:]
>>> Router adding trusted server: private.localhost
>>> router 2009-06-16 09:25:26 [INFO:30364:osrf_router_main.c:129:]
>>> Router adding trusted client: private.localhost
>>>
>>
>> I see no file called public.router.log, but there are two processes
>> running:
>>
>> $ ps -eaf | grep OpenSRF
>> opensrf  30368     1  0 09:25 ?        00:00:00 OpenSRF Router
>> opensrf  30369     1  0 09:25 ?        00:00:00 OpenSRF Router
>> opensrf  30385 29763  0 09:28 pts/1    00:00:00 grep OpenSRF
>>
>> 4. If I stop the router now:
>> 	osrf_ctl.sh -l -a stop_router
>>
>> NOW I see a public.router.log file, and it says:
>>
>>> router 2009-06-16 09:51:55 [WARN:30368:osrf_router_main.c:11:]
>>> Received signal [2], cleaning up...
>>>
>>
>> So while the public router comes up, something's not right. But I
>> have no idea how to diagnose this further.
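
One way to dig further is to pull the level, PID, and source location out of each router log line so they can be matched against the ps listing. The sketch below assumes every entry follows the shape of the lines quoted above (name, date, time, then [LEVEL:pid:file:line:] and the message on a single line); the pattern and function name are illustrative, not part of OpenSRF.

```python
import re

# Pattern inferred from the log lines quoted in this message (assumption:
# each entry sits on one line in the actual log file).
LOG_LINE = re.compile(
    r"^(?P<name>\S+) (?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+):(?P<pid>\d+):(?P<file>[^:]+):(?P<line>\d+):\] "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["pid"] = int(entry["pid"])
    entry["line"] = int(entry["line"])
    return entry


if __name__ == "__main__":
    sample = (
        "router 2009-06-16 09:51:55 [WARN:30368:osrf_router_main.c:11:] "
        "Received signal [2], cleaning up..."
    )
    entry = parse_log_line(sample)
    print(entry["level"], entry["pid"], entry["message"])
```

For the WARN entry in public.router.log, the parsed PID is 30368, which is one of the two "OpenSRF Router" processes in the ps output. The posted config also sets <loglevel>2</loglevel> for the public router and 4 for the private one, which may be why public.router.log stays empty until a warning is written.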
>>
>> The only change in my opensrf_core.xml file since I last posted it
>> was to change the names of the log files, as indicated above. So the
>> differences between this core file and the example one included in
>> OpenSRF 1.0.6 are just the passwords and the log files.
>>
>>> $ diff opensrf_core.xml.example opensrf_core.xml
>>> 38c38
>>> <     <passwd>password</passwd>
>>> ---
>>>>   <passwd>*****</passwd>
>>> 104c104
>>> <     <passwd>password</passwd>
>>> ---
>>>>   <passwd>*****</passwd>
>>> 128c128
>>> <                 <password>password</password>
>>> ---
>>>>               <password>*****</password>
>>> 133c133
>>> <             <logfile>/openils/var/log/router.log</logfile>
>>> ---
>>>>           <logfile>/openils/var/log/public.router.log</logfile>
>>> 150c150
>>> <                 <password>password</password>
>>> ---
>>>>               <password>*****</password>
>>> 155c155
>>> <             <logfile>/openils/var/log/router.log</logfile>
>>> ---
>>>>           <logfile>/openils/var/log/private.router.log</logfile>
>>
>>
>> Here's my slightly updated opensrf_core.xml file.
>>
>>
>>> <?xml version="1.0"?>
>>> <!--
>>> vim:et:ts=2:sw=2:
>>> -->
>>> <config>
>>>
>>> <!-- bootstrap config for OpenSRF apps -->
>>> <opensrf>
>>>
>>>   <routers>
>>>
>>>     <!-- define the list of routers our services will register
>>> with -->
>>>
>>>     <router>
>>>
>>>       <!-- This is the public router.  On this router, we only
>>> register applications
>>>            which should be accessible to everyone on the opensrf
>>> network -->
>>>       <name>router</name>
>>>       <domain>public.localhost</domain>
>>>       <services>
>>>           <service>opensrf.math</service>
>>>       </services>
>>>     </router>
>>>
>>>     <router>
>>>       <!-- This is the private router.  All applications must
>>> register with
>>>           this router, so no explicit <services> section is
>>> required -->
>>>       <name>router</name>
>>>       <domain>private.localhost</domain>
>>>     </router>
>>>   </routers>
>>>
>>>
>>>   <!-- Jabber login settings
>>>       Our domain should match that of the private router -->
>>>   <domain>private.localhost</domain>
>>>   <username>opensrf</username>
>>>   <passwd>privctltsrf</passwd>
>>>   <port>5222</port>
>>>   <!-- name of the router used on our private domain.
>>>       this should match one of the <name> of the private router
>>> above -->
>>>   <router_name>router</router_name>
>>>
>>>   <!-- log file settings ======================================  -->
>>>   <!-- log to a local file -->
>>>   <logfile>/openils/var/log/osrfsys.log</logfile>
>>>
>>>   <!-- Log to syslog. You can use this same layout for
>>>       defining the logging of all services in this file -->
>>>   <!--
>>>   <logfile>syslog</logfile>
>>>   <syslog>local2</syslog>
>>>   <actlog>local1</actlog>
>>>   -->
>>>
>>>   <!-- 0 None, 1 Error, 2 Warning, 3 Info, 4 debug, 5 Internal
>>> (Nasty) -->
>>>   <loglevel>3</loglevel>
>>>
>>>   <!-- config file for the services -->
>>>   <settings_config>/openils/conf/opensrf.xml</settings_config>
>>>
>>> </opensrf>
>>>
>>> <!-- Update this if you use ChopChop -->
>>> <chopchop>
>>>   <!-- Our jabber server -->
>>>   <domain>private.localhost</domain>
>>>   <port>5222</port>
>>>   <!-- used when multiple servers need to communicate -->
>>>   <s2sport>5269</s2sport>
>>>   <secret>secret</secret>
>>>   <listen_address>10.0.0.3</listen_address>
>>>   <loglevel>3</loglevel>
>>>   <logfile>/openils/var/log/osrfsys.log</logfile>
>>> </chopchop>
>>>
>>> <!-- The section between <gateway>...</gateway> is a standard
>>> OpenSRF C stack config file -->
>>> <gateway>
>>>
>>>   <!--
>>>   we consider ourselves to be the "originating" client for requests,
>>>   which means we define the log XID string for log traces
>>>   -->
>>>   <client>true</client>
>>>
>>>   <!--  the routers's name on the network -->
>>>   <router_name>router</router_name>
>>>
>>>   <!--
>>>   These are the services that the gateway will serve.
>>>   Any other requests will receive an HTTP_NOT_FOUND (404)
>>>   DO NOT put any services here that you don't want the internet to
>>> have access to
>>>   This section will be soon deprecated for multi-domain mode...
>>>   -->
>>>   <services>
>>>     <service>opensrf.math</service>
>>>   </services>
>>>
>>>   <!-- jabber login info -->
>>>
>>>   <!-- The gateway connects to the public domain -->
>>>   <domain>public.localhost</domain>
>>>   <username>opensrf</username>
>>>   <passwd>pubctltsrf</passwd>
>>>   <port>5222</port>
>>>   <logfile>/openils/var/log/gateway.log</logfile>
>>>   <loglevel>3</loglevel>
>>>
>>> </gateway>
>>>
>>> <!--
>>> ======================================================================
>>> -->
>>>
>>>   <routers>
>>>       <router> <!-- public router -->
>>>           <trusted_domains>
>>>               <!-- allow private services to register with this
>>> router
>>>                    and public clients to send requests to this
>>> router. -->
>>>               <server>private.localhost</server>
>>>               <!-- also allow private clients to send to the
>>> router so it can receive error messages -->
>>>               <client>private.localhost</client>
>>>               <client>public.localhost</client>
>>>           </trusted_domains>
>>>           <transport>
>>>               <server>public.localhost</server>
>>>               <port>5222</port>
>>>               <unixpath>/openils/var/sock/unix_sock</unixpath>
>>>               <username>router</username>
>>>               <password>pubctltroute</password>
>>>               <resource>router</resource>
>>>               <connect_timeout>10</connect_timeout>
>>>               <max_reconnect_attempts>5</max_reconnect_attempts>
>>>           </transport>
>>>           <logfile>/openils/var/log/public.router.log</logfile>
>>>           <!--
>>>           <logfile>syslog</logfile>
>>>           <syslog>local2</syslog>
>>>           -->
>>>           <loglevel>2</loglevel>
>>>       </router>
>>>       <router> <!-- private router -->
>>>           <trusted_domains>
>>>               <server>private.localhost</server>
>>>               <!-- only clients on the private domain can send
>>> requests to this router -->
>>>               <client>private.localhost</client>
>>>           </trusted_domains>
>>>           <transport>
>>>               <server>private.localhost</server>
>>>               <port>5222</port>
>>>               <username>router</username>
>>>               <password>privctltroute</password>
>>>               <resource>router</resource>
>>>               <connect_timeout>10</connect_timeout>
>>>               <max_reconnect_attempts>5</max_reconnect_attempts>
>>>           </transport>
>>>           <logfile>/openils/var/log/private.router.log</logfile>
>>>           <!--
>>>           <logfile>syslog</logfile>
>>>           <syslog>local2</syslog>
>>>           -->
>>>           <loglevel>4</loglevel>
>>>       </router>
>>>   </routers>
>>>
>>> <!--
>>> ======================================================================
>>> -->
>>>
>>> </config>
>>>
>>
>>
>>
>>
>> --
>> Victoria Bush
>> Opscan Evaluation Manager
>> Center for Teaching, Learning & Technology
>> vbush at ilstu.edu 
>>
>>
>>
>
> --
> Victoria Bush
> Opscan Evaluation Manager
> Center for Teaching, Learning & Technology
> vbush at ilstu.edu 
>
>
>

--
Victoria Bush
Opscan Evaluation Manager
Center for Teaching, Learning & Technology
vbush at ilstu.edu 




