I am using Open MPI 1.3.3 with OFED on an HP cluster running Red Hat Linux. Occasionally (not always) I get a crash with the following message:

    flow(SendRealsAlongFaceWithOffset_3D+0x4ab)
    mpirun noticed that process rank 6 with PID 9312 on node hydra11 exited on signal 11 (Segmentation fault).

The crash does not always appear; sometimes the application runs fine. However, it seems to occur especially when I run on more than one node. I have consulted the Open MPI archive and found many error messages of the same kind, but none from version 1.3.3 and none of direct relevance.

I would really appreciate comments on this. Below is the information required according to the Open MPI website.

Open MPI was configured with a prefix and with the path to openib, and with the following compiler flags

The application (named flow) was launched on hydra11 by

The PATH and LD_LIBRARY_PATH on hydra11 and hydra12:

    PATH=/home/ipl/openmpi-1.3.3/platforms/hp/bin
    LD_LIBRARY_PATH=/home/ipl/openmpi-1.3.3/platforms/hp/lib

Furthermore, I can say that I have not specified any MCA parameters.

I am working on an LPC1768 SBL which includes the following code to jump to the user application.

    void NVIC_SetVectorTable(DWORD NVIC_VectTab, DWORD Offset)
    {
        NVIC_VECT_TABLE = NVIC_VectTab | (Offset & 0x1FFFFF80);
    }

    /* Change the Vector Table to the USER_FLASH_START
       in case the user application uses interrupts */
    NVIC_SetVectorTable(NVIC_VectTab_FLASH, USER_FLASH_START);
    User_code_entry = (void (*)(void))((USER_FLASH_START)+1);

After adding some heap memory to the code, the machine gets stuck. After some deep debugging, I found that the machine does not get stuck when the value at the first location of the application bin file is divisible by 64.

When I set the heap memory to 0x00002E90, it generates a stack base of 0x10005240. Then stack base + stack size (0x2900) gives 0x10007B40. I found that this value is loaded at the first location of the application bin file. It is divisible by 64 and the code runs without getting stuck.

But when I set the heap memory to 0x00002E88, it generates a stack base of 0x10005238. Then stack base + stack size (0x2900) gives 0x10007B38. This value is not divisible by 64 and the code gets stuck. The disassembly is as follows in this case. When stepping from address 0x0000 2000, it goes to the hard fault handler. But in the earlier case it doesn't go to hard fault.

I cannot understand the instruction DCW and why it goes to hard fault. Can anyone tell me the reason behind this?

Executing the vector table is what you do on older ARM7/ARM9 parts (or bigger Cortex-A ones) where the vectors are instructions, and the first entry will be a jump to the reset handler, but on Cortex-M, the vector table is pure data - the first entry is your initial stack pointer, and the second entry is the address of the reset handler - so trying to execute it is liable to go horribly wrong.

As it happens, in this case you can actually get away with executing most of that vector table by sheer chance, because the memory layout leads to each halfword of the flash addresses becoming fairly innocuous instructions:

    2: 1000 asrs r0, r0, #32

Until you eventually bumble through all the remaining NOPs to 0x20d8 where you pick up the real entry point.

However, the killer is that initial stack pointer, because thanks to the RAM being higher up, you get this:

    0: 7b38 ldrb r0,

The lower byte of 0x7bxx is where the base register is encoded, so by varying the address you have a crapshoot as to which register that is, and furthermore whether whatever junk value is left in there also happens to be a valid address to load from. Do you feel lucky?

Anyway, in summary: Rather than call the address of the vector table directly, you need to load the second word from it, then call whatever address that contains.
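
The two points made in the answer (that the first halfword of the table happens to decode as a load through some base register, and that the second word of the table should be read as data and called) can be sketched in plain C that runs on a host, so the numbers from the question can be checked. This is only an illustration under stated assumptions: the names `decode_ldrb_imm`, `first_halfword`, `read_vector_table` and `struct app_entry` are made up here and are not part of the bootloader in the question; the decoder assumes the standard 16-bit Thumb LDRB (immediate) encoding, and `first_halfword` assumes a little-endian image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a 16-bit Thumb "LDRB Rt, [Rn, #imm]" instruction. */
struct ldrb_imm {
    unsigned rt;  /* destination register number */
    unsigned rn;  /* base register number */
    unsigned imm; /* byte offset added to the base register */
};

/* Decode insn if it is LDRB (immediate): top five bits are 01111.
 * Returns false and leaves *out untouched for any other instruction. */
bool decode_ldrb_imm(uint16_t insn, struct ldrb_imm *out)
{
    assert(out != NULL);
    if ((insn >> 11) != 0x0F)
        return false;
    out->imm = (insn >> 6) & 0x1F;
    out->rn  = (insn >> 3) & 0x07;
    out->rt  = insn & 0x07;
    return true;
}

/* Low halfword of the initial stack pointer (stack base + stack size).
 * In a little-endian image this halfword sits at offset 0 of the table,
 * so it is what gets fetched first if the table is executed as code. */
uint16_t first_halfword(uint32_t stack_base, uint32_t stack_size)
{
    return (uint16_t)((stack_base + stack_size) & 0xFFFFu);
}

/* Hypothetical holder for the first two vector table entries. */
struct app_entry {
    uint32_t initial_sp;    /* word 0: initial stack pointer */
    uint32_t reset_handler; /* word 1: address of the reset handler */
};

/* Treat the vector table as data: read the entries, do not execute them. */
struct app_entry read_vector_table(const uint32_t *table)
{
    assert(table != NULL);
    struct app_entry e = { table[0], table[1] };
    return e;
}

/* Assumed usage on the target (not compiled in this host sketch):
 *   struct app_entry e =
 *       read_vector_table((const uint32_t *)USER_FLASH_START);
 *   ((void (*)(void))e.reset_handler)();
 */
```

With the values from the question, 0x10005238 + 0x2900 places 0x7B38 at offset 0, which this decoder reads as a byte load through r7, while 0x10005240 + 0x2900 gives 0x7B40, a byte load through r0. That seems consistent with the answer's point that whether the load faults depends on which register ends up as the base, so the divisible-by-64 pattern would be a side effect rather than the cause. Many bootloaders also load the main stack pointer from the first word before jumping; that step is left out here.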