Bugs fixed in SGE 6.0u12 since release 6.0u11
---------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
376      4743006   problem with floating point job resource limits
2189     6525375   qacct ignores jobs in output
2249     6568575   SGE does not work if primary group entry is too big in groups map
2265     6280747   qmon loses sharetree changes
2270     6575720   ENABLE_ADDGRP_KILL is missing from sge_conf(5)
2272     6575727   sge_shadowd(8) man page is missing some env vars
2276     6575731   share_tree(5) doesn't explain type field
2289     6565951   Qmon panel does not check for valid data in Scheduler Configuration
2290     6566033   Help for Browser panel in qmon incomplete
2299     6536039   sgeremoterun not working
2307     6433628   qconf -sq all.q@myhost produces no value at all for complex_values (not even NONE)
2312     6482211   complex attributes whose deletion is denied don't reflect back after the denial message in qmon
2313     6410592   Double clicking in Consumables/Fixed Attributes list does not behave as a GUI should
2314     6513116   Qmon x qconf inconsistent in allowed characters in attribute names
2320     6513115   in qmon, under calendar configuration, it is possible to modify even if no calendar exists
2323     6576153   Creating a userset with NONE as a type results in a core dump
2324     6576197   Userset type should not accept empty string value
2325     6542987   drmaa_run_job(3) raises error if drmaa_native_specification has leading spaces
2344     6590079   Resource reservation broken with sequences of identical jobs differing only in their -R y|n
2346     6604155   qmon binary job submit is broken
2356     6600619   Userset spooling in classic mode is broken
2374     6589459   Expose the availability of keyword "none" in the manual page of calendar_conf
2375     6610788   qdel returns wrong exit code
2394     6608259   scheduler prints empty line in messages file after every 'sge_mirror' logging
2401     6617450   add option to reporting_params for switching off writing of consumables
2404     6618328   qmon displays wrong string for queue filtering
2409     6619016   removing parameters from the reporting_params will not fall back to the default
2414     6618599   Long running jobs cause incorrect usage summary for ARCo database
2417     6622842   the start_time field in intermediate accounting records is incorrect
2419     6391244   qstat -ext reports wrong usage as compared to other commands such as qstat -t or qstat -j
2424     6620253   During the installation the admin user should create web.xml file
-        6541085   NFS write error on N1GE trace file
-        6195248   QMON Job Control Window: Incomprehensible Priority Button

Bugs fixed in SGE 6.0u11 since release 6.0u10
---------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
752      6288953   scalability issue with qdel and very large array jobs
-        6233523   loadcheck reports on a hyperthreaded CPU only one processor
2188     6421113   CSP mode auto installation: certificates are not copied to submit hosts
2243     6555744   qmon crashes when displaying about dialog
2275     6564503   sge_schedd deadlock upon schedd_job_info job_list being enabled
2262     4742097   Qmon has a ticket number limitation
1729     4818801   qmon on secondary screen crashes when "Job Control" is pressed
2258     5081743   queue status in reporting file is missing.
747      6291044   "Modify"-Button is activated but should be grayed
-        6317563   reporting(5) man page lacks information about sharelog records
2260     6327539   Ability to sort queue instances using each column of the queue instances table
1813     6328064   Queue request -q from sge_request can't be overridden through command line
1860     6345522   qdel on a job in deleted state does not output any information
916      6355875   qsub -terse to just output job id
1948     6364661   qrsh man page doesn't explain which options don't work with interactive jobs
804      6367642   Numbers in error mail too large
2050     6422335   still used usersets/project/calendar/pe/checkpoint can be removed under certain conditions
-        6426331   remove util/sge_log_tee from distribution
2058     6428495   shell_start_mode should be documented to be only used for batch jobs
-        6447133   reserved usage not explained in sge_conf.5
-        6470048   Discrepancy between load values reported by Gridengine and from the HP-UX 64 bit env.
2196     6472614   auto installation option failed to save the install log
-        6476263   function job_get_id_string() is not MT safe and used in qmaster
2061     6494390   Broken output of job name with 'qsub -N'
2183     6499217   meaningless error in clients when reporting_param flush_time is incorrectly set
2182     6513433   remote installation of execd's need enhancement, rework, cleanup
2171     6516288   Scheduler does not write pid file in daemonize phase
2178     6518607   invalid memory access in cl_com_get_handle
2180     6518684   Qconf usage x man page inconsistency
2181     6518689   Project man page contains different attribute names.
2221     6521802   the binary check in inst_sge is wrong!
-        6522273   Wrong exit code with qconf -sds
2192     6525917   qacct -l h= dumps core on darwin and linux itanium
2219     6536426   inst_sge -m fails for non-root when USER variable is not set
-        6537633   Extraneous space in qsub's "Invalid month specification." message
2222     6538293   Hybrid user/project share-tree is broken for user sharing amongst array jobs
2261     6538740   clear usage operation should implicitly trigger refresh in share-tree dialogue
2229     6544869   UNKNOWN group/owner in accounting(5)
2263     6553066   qmon's Complex Configuration Load and Save buttons did not work
2187     6562190   memory leak in sge_schedd

Bugs fixed in SGE 6.0u10 since release 6.0u9
--------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
126      6287963   qdel of just submitted job
1635     6291036   can't start qmaster message appears, but qmaster is started
1920     6332587   inst_sge script does not add the master host to the shadow_masters file
1914     6349814   wrong qlogin_daemon or rlogin_daemon in host conf doesn't set host and job into error state
1904     6353526   reprioritize field in qmon cluster config missing
1905     6353558   hostname resolving should not be case sensitive
1931     6359575   drmaa_version() function should return 1.0
1740     6365045   DRMAA sessions should be persistent
1993     6391930   drmaa_control() causes illegal memory access
1994     6391947   getDrmaaImplementation() should return the same string as getDrmSystem()
2023     6407525   qconf rejects configuration, when attribute value ends with a space character
2051     6412865   during QMaster installation, creation of local database directory fails on hp11
2153     6421696   the execd auto_install takes too long because of long delays after a parallel install block
2080     6422610   Unable to modify Advanced Settings in Configuration for Host in my cluster using qmon
2127     6425823   qacct -l h= dump core
2059     6429305   shared library name DT_SONAME not set with libdrmaa.so
2133     6437853   Berkeley DB backup failed when using hostname with a fully qualified domain name
2134     6438084   the inst_common.sh is missing $SGE_EXECD_PORT
-        6438475   potential security issues in cull library
2132     6444537   inst_sge -help wrongly indicates -bup/-rst works with BDB spooling only
2081     6445469   qping segfaults in ssl mode
2167     6448704   The sge_share_mon utility does not work with the automatic policy enforcement
2095     6453427   the auto uninstall execd needs a ssh daemon when the uninstall is done local
2092     6457900   accounting records for slave tasks of pe jobs contain invalid submission time
-        6474739   DRMAA 1.0 interface need complete documentation in man pages
2145     6475272   qselect matches wrong resources which have been overridden at lower level
2128     6482762   qsh does not work if XAUTHORITY is set in root environment
2106     6483941   In certain cases jobs may stay in "t" state for 5 minutes
2108     6486125   qmaster logging "scheduler tried to remove a incomplete"
2144     6488244   ignore_fqdn is broken for the local configuration
2080     6494508   host already exists when modifying cluster settings
2122     6497217   segmentation fault with empty string
2125     6497575   qmaster performance gets throttled if qsub -sync y is used when many jobs are in the system
2129     6500359   sge_conf(5) setting 'max_u_jobs' broken if BDB spooling is used
571      6506637   job control: sorting by different fields
721      6506677   qmon job control: display wider default columns
2151     6507576   load formula does not recognize float as weighting factor
2158     6509684   qmaster dies when modifying slots value for queue domain when queuename is missing
2157     6511171   spooledit cannot dump USERSET objects
2169     6513226   default xterm path in arch_variables script not correct for darwin architectures
2169     6513233   qsh problems on darwin architecture because of wrong crypto lib
2172     6517015   execution daemon can crash on Linux where libnss_ldap.so uses BDB 4.2 shared library

Bugs fixed in SGE 6.0u9 since release 6.0u8
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
-        6480580   CSP mode is affected by OpenSSL Security Advisory [28th September 2006]
2107     6475282   account string does not accept the "|" character
2094     6458517   unreasonably long scheduler dispatch times if lots of projects are used in share tree
2093     6458510   unreasonably long scheduler dispatch times if lots of cluster queues are deployed in large clusters
2055     6424565   jobs with negative priority will be rejected by qmaster

Bugs fixed in SGE 6.0u8 since release 6.0u7
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
368      4737342   interactive jobs leave behind output/error files if prolog/epilog are run
1064     5063318   install: screen not cleared
1934     5109725   Edits to settings.csh and settings.sh
1639     6287945   Interrupting qrsh while pending does not remove job
1550     6291033   Unclear share calculation of running jobs
1741     6319223   subordinate properties lost on qmaster restart
1945     6363823   qsub -w w changes -sync behavior
1947     6364440   qconf -mhgrp results in glibc error message and abort
1950     6365380   possible buffer overflow in sge_exec_job()
-        6366691   utilbin//rsh can be used to gain root access
1956     6368747   Job tickets are not correctly shown in qstat for non-running jobs
1955     6368942   qselect man page refers to qconf -mqattr
1985     6380207   RPC Berkeley DB install failed due to FQDN hostname
1977     6383513   resource filtering in qselect broken
1972     6384682   "qstat -j" aborts
1980     6384698   schedulers mem use growing, if pe jobs are running
1981     6384709   slow scheduler performance for jobs with hard queue requests
1957     6384812   qstat produces non-well-formed XML output
-        6387206   CSP revocation lists are not supported
1986     6387371   The parallel automatic install may overload machines
1990     6389526   commlib closes wrong connection on SSL error
1998     6390494   qrsh issue with interactive jobs and directory write permissions
1997     6391238   qrsh does not accept -o/-e/-j
1999     6396851   sched_conf manual page contains CVS markup
2007     6397383   qmaster deadlock when reporting file cannot be written
-        6397987   several buffer overruns
-        6398008   Off-by-one overrun in communication library
2003     6398723   Tickets are not reset for running jobs after disabling the ticket policy
-        6400729   weak authentication and authorization in CSP mode
2010     6401993   qstat -u crashes
2017     6405794   qstat.xsd is missing the cqueue_summary_t/load element
2021     6407513   Scheduler hangs after a qmaster crash and restart
2022     6407523   scheduler tuning during installation is broken
2026     6408109   CSP installation with admin user = root broken
2027     6408248   qmon crashes on lx24-ia64
2028     6411230   Job Sequence Number got screwed up when restarting qmaster daemon
2029     6411660   The man page QUEUE_CONF(5) does not describe the memory specifier 'G'

Bugs fixed in SGE 6.0u7 since release 6.0u6
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
1922     4780562   xterm flags -e and -ls do not work with qsh and this should be documented
1921     4919544   outdated/incomplete documentation on qalter and qmod
1149     5056331   autoinstall hangs if root owns the files on $SGE_ROOT
1248     5090187   Install scripts fails in adding >sge_request< options file with $ADMINUSER
1842     6207868   wording with qconf -cq should be changed
1330     6239653   auto installation doesn't provide sufficient diagnosis output
1451     6239658   inst_sge -ux -host might be incomplete if not run from an admin host
1490     6242169   Multi-threaded, multi-CPU username problems
1750     6250692   accounting(5) record can't be made available immediately after job finish
1823     6252471   sge qmaster startup and shutdown (non critical) error message as non root
1556     6253860   First character is lost in quoting
1803     6255111   Binary jobs are problematic for starter and epilog scripts
1780     6256590   qconf -mq disallows 2057 hostspecific profiles in slots configuration
1801     6268799   confusing execd startup messages and delays in case of problems
1626     6275789   soft requirements on load values are ignored
-        6279523   qlogin on windows does not work!
1661     6282996   use of IP address as host name disables unique hostname resolving
1665     6286510   delivery of queue based signals to execd repeated endlessly
1578     6287828   shadow master uninstallation cannot manage shadow_masters containing several lines
814      6287847   qstat -j shows wrong message for parallel jobs which can't be dispatched
1028     6287850   Allow SIGTRAP to enable debugging
1265     6287860   effect of -p priority and weight_priority not described in sge_priority(5)
1306     6287862   qhost -l for complexes is broken
1363     6287865   qrsh default job names are not consistent with documented job name limitations
1527     6287910   $pe_hostfile has 4 entries, man page says 3
1619     6287935   qmod -sq can kill a pe job in t state
1631     6287940   Job error state is not documented in the qstat man page
1640     6287946   qconf -[dm]attr gets confused by shortcuts
1652     6287953   repeated logging of the error message: "failed building category string for job N"
1655     6287955   strange reservation
1695     6288626   default PATH variable set for job insufficient for non-login shell jobs
1249     6289240   Install will fail if non-root ADMINUSER selected and they don't own $SGE_ROOT
1731     6289455   qstat -XML output does not match the schema
1378     6291016   qmon startup and queue add/modify warning messages
1475     6291023   qstat -j doesn't print delimiter between jobs
1679     6292742   tight integration - qrsh_exit_code file not written
1680     6292751   admin mail information is incorrect
1798     6292926   qconf -mattr can crash qmaster
1752     6293411   NFS write error on host : Permission denied.
1691     6294052   suspend threshold is not working for calendar disabled queues
1802     6294875   CSP: consolidate error output if cert CA on client and server don't match
1650     6295231   Java language binding email property doesn't work
1651     6295233   JobTemplate property getters throw InvalidAttributeException
1720     6295791   qacct -h should not resolve hostnames
1724     6299982   Slow submission rate with drmaa_run_job()
1800     6301047   qstat -s p doesn't show pending array tasks while there are tasks of this job running
1727     6303671   DRMAA can abort in the middle of a session if NIS becomes unavailable
1715     6304466   qmaster crashes with large number of qconf -aattr calls
1687     6304471   qlogin -R does not work like documented
1732     6304490   qconf -as/-ah leads to segmentation fault
1733     6305095   qstat schema files are incomplete
1738     6306229   wrong soft requests decision
1742     6306834   consumables as thresholds are not working correctly with pe jobs
1739     6307557   qhost returns wrong total_memory value on MacOSX 10.3
1861     6310168   autoinstall does not support csp installation mode
1758     6313445   Qrsh tries to free invalid pointer
-        6314019   qloadsensor.exe uses up more and more handles
1924     6314301   -hold_jid option in the man page does not correctly reflect reality.
1819     6314306   using "-bup" with "-auto" breaks with later update release
1761     6315111   doing a qalter -l rsc=val on running jobs breaks consumable debit
1767     6316995   qconf -mp prints error messages two times
1768     6317028   Quotes in job category can result in memory corruption
1763     6317048   Memory leaks in drmaa library, japi_wait and drmaa_job2sge_job
1772     6318018   shepherd doesn't handle qrlogin/qrsh jobs correctly
1778     6318659   sge_ca -usercert fails when executed more than once
1773     6318660   the system hold on an array task can vanish
1749     6319228   Backslash line continuation is broken for host groups
1760     6319231   unable to delete a configuration of a non existing host
1762     6319233   Parsing of context variable options fails for values containing commas in single quotes
1770     6320683   Binary switch reversed in job category and can cause application to hang
1820     6320869   sge_qmaster daemon is running on both the master and shadow nodes after a long network failure
1787     6322498   calendar syntax "week mon=0-21" corrupts SGE and may crash qmaster
1923     6325359   comments in sge_request file refer to cod_request(5) manpage but should say sge_request(5)
1810     6327427   qping core dump with enabled message content dump
1814     6328703   fstype does not recognize nfs4 share in all cases
1847     6329832   qconf and qmaster accept invalid settings for queue complex_values
-        6331433   gemm install hangs on Fedora Core 1
1821     6332876   qstat -U does not consider queue access for job and project access for queues
1822     6332877   qstat -pe filter does not work
1826     6333407   configuring the halflife_decay_list crashes the qmaster
1828     6333467   sgemaster -migrate may not delete qmaster lock file and may break shadowd functionality
1838     6336519   changing the cwd flag in qmon - qalter has no effect
1856     6338314   occasional "failed to deliver job" errors due to SIGPIPE in sge_execd
1837     6339756   Quotes in qtask file can result in memory corruption
1848     6342005   a scheduler configuration change with a sharetree can result in a usage leak
1866     6346696   connection to Berkeley DB RPC server can timeout
1845     6346704   qrsh -V doesn't always work.
1869     6347267   sge_ca script fails when no /dev/random present due to permission problem
-        6347351   sgeCA may not be consistent after new installation
1872     6347840   -mhgrp switch is missing in qconf man page
1874     6348299   qconf -mstree aborts
1876     6348516   job finish does not terminate all processes of a job
1877     6348517   job finish although terminate method is still running
1895     6349351   complex man page describes regex incorrectly
1890     6349768   Upgrade from 5.3 to 6.0 fails with an empty complex in the 5.3 cluster
1891     6349818   an additional started schedd/execd daemons may not stop if started when qmaster is down
1892     6349972   DRMAA crashes during some operations on bulk jobs
1782     6350714   qconf -purge option not fully explained in help output
1783     6351174   "qconf -purge queue slots all.q" doesn't behave as expected
1897     6351240   -rsstree is missing from qconf man page
1898     6351278   qconf man page options are out of order
1900     6351728   installation of qmaster failed when using /etc/services
1904     6353526   reprioritize field in qmon cluster config missing
1925     6353555   qresub manpage imprecise and partly incorrect
1882     6354143   mutually subordinating queues suspend each other simultaneously
1912     6354164   drmaa does not work on hp11 platform
1913     6354236   auto install ignores DB_SPOOLING_SERVER setting
1916     6355263   reschedule of a parallel job crashes the qmaster
1783     -         "qconf -purge queue slots all.q" doesn't behave as expected
1870     -         Manpages qhold/qrls refer to old -uall option

Bugs fixed in SGE 6.0u6 since release 6.0u5
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
1692     6294118   no newline for "qstat -f -explain A"
1718     6298056   INHERIT_ENV and SET_LIB_PATH are not reset by setting execd_params to NONE
-        6298233   no user notification or command hanging if an immediate job cannot be scheduled
-        6299345   No error messages in case SSL initialization fails
-        6299351   qrsh fails when execd_param INHERIT_ENV=false and no ARC set in sge_execd environment
-        6299939   distribution should contain all Berkeley DB utilities
-        6299943   distribution should contain documentation for the Berkeley DB utilities

Bugs fixed in SGE 6.0u5 since release 6.0u4
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
403      4769608   qalter shows wrong priority number when using negative priorities with -p option
1084     5063313   no links for SGE startup scripts for shutdown created
1420     6218877   qstat -t is broken
1108     6245812   qmon failed to find SGE shared library due to user-defined LD_LIBRARY_PATH_64
1541     6250603   qmon crash (segmentation fault) on Solaris64
1547     6252469   misleading qstat -j messages in case of resource reservation
1625     6252525   qmon: complex attributes not removeable
1583     6260656   incomplete resource reservation with array jobs
1591     6262009   backup script does not backup sgeCA directory for CSP systems
1596     6263509   autoinstall fails, trying to install a execd on masterhost
1595     6264592   drmaa_control(DRMAA_JOB_IDS_SESSION_ALL, DRMAA_CONTROL_SUSPEND|RESUME) returns INVALID_JOB error
1597     6265154   Wildcards in PE Name Cause Unusual Behavior
1623     6266392   Performance problem with qconf -mattr exechost XX XX global
1624     6266450   performance bottleneck with subordinate list
1632     6267238   Multithreaded DRMAA may crash due to use of sge_strtok()
1598     6267245   Repeated logging of the same message produces giant logging files
1612     6267932   high CPU load of qmaster even on empty cluster
1620     6268707   job_load_adjustments is not correctly working when parallel jobs are submitted.
1621     6269305   qrsh/qsh/qlogin reject -js option
1654     6269411   Close integration cause jobscripts with multiple mprun commands to be killed.
1627     6272451   execd auto_install performance bottleneck
1610     6273006   qstat -j "" results in a segmentation fault
1657     6273217   race condition with qsub -sync and drmaa_wait() if job exits directly after being submitted
1446     6274467   qmon kills a system
1669     6277874   N1GE6U4 installation on Red Hat creates wrong rc*.d script names, such as /etc/rc3.d/S-1sgeexecd
1642     6277909   qconf -mq coredumps
1646     6278140   inst_sge -sm don't install a startup script
1647     6278146   inst_sge -db error on MacOS
1648     6278147   drmaa_job_ps() returns DRMAA_PS_QUEUED_ACTIVE for finished array job rather than DRMAA_PS_DONE
1656     6278727   qstat -xml -urg output contains badly formatted numbers
1659     6279402   drmaa_exit() causes qmaster error logging if host is no admin host
1616     6279409   qconf -tsm command generates too much data (very large schedd_runlog file)
1531     6280698   Resource filtering with qhost broken
1658     6281440   resource allocation shown by qstat/qhost not consistent with resource utilization
1601     6281462   qmaster profiling can only be turned on by restarting qmaster
1662     6283308   overhead with job execution could lead to overoptimistic backfilling and break resource reservation
1667     6285898   qconf -Xattr does not resolve fqdn hostnames
1665     6286510   delivery of queue based signals to execd repeated endlessly
1666     6286533   job wallclock monitoring and enforcement considers prolog/epilog runtime part of net job runtime
1481     6287824   Asking to have RC scripts removed INSTALLS on SUSE
1617     6287831   Bad check for jobs when removing execution hosts
1410     6287867   tight integration: temporary files are not deleted at task exit
1670     6287917   "dl.sh 0" doesn't unset SGE_ND
1652     6287953   getting many E messages "failed building category string for job N"
1671     6287958   suspend not working under Mac OS X
1673     6288156   sge_shepherd SEGV's when it tries to fopen the usage file
1674     6288588   jobs submitted with -v PATH do not retain $TMPDIR prefixed by N1GE as required for tight integration
1694     6294397   wrong drmaa jnilib link on MacOS
-        6294915   Document installation if domain users intend to use N1GE
-        6294979   update format specifier and command line options in qping(1) man page
-        6294980   add man page for sgepasswd command
-        6294982   document DURATION_OFFSET parameter
-        6294987   document ENABLE_WINDOMACC parameter if Windows domain user accounts should be used
1705     6295165   finished array job tasks can be rescheduled if master/scheduler daemons are stopped/started
1569     -         SGE install of libdb-4.2so conflicts with Fedora Core 3 version

Bugs fixed in SGE 6.0u4 since release 6.0u3
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
-        4760393   SGE installation on a host with IP Multipathing has to be documented.
-        4760401   SGE should install properly on hosts with IPMP activated
-        4768907   Insufficient instruction for GridEngine install as a normal user
497      4820420   sge_shadowd(8) man page should be improved
-        4876872   Doc section on the shadow/failover configuration is incomplete
-        4975432   Installation Guide: Missing step in secure install
-        5048312   Incorrect trademarks on N1 Grid Engine 6 Installation Guide
1119     5071527   Error messages with autoinstallation
-        5079032   Figure 3-5 and its descriptive text do not match and docfeedback@sun.com mentioning
1220     5085004   qstat -f -q all.q@HOSTNAME does not resolve hostname
1535     5086193   load.sh fails on a machine when uptime displays time for less than an hour
-        5097424   ARCO install documentation should be improved
-        5104922   ARCO install instructions has minor errors
1506     6178843   qconf changes to complex doesn't display all the changes made upon exit
-        6186597   qconf error diagnosis broken
-        6193945   qstat options -urg/-pri/-explain not covered in admin/users guide
1334     6194719   starter_method is ignored with binary jobs that are started without a shell
-        6196556   Need samples in Admin guide to illustrate N1GE6 policy capabilities
1493     6197109   install_execd does not pick up $SGE_CELL
1347     6197730   Problems with shadowd install
1359     6199256   qconf -[a|A|m|M]stree kills qmaster
1384     6203977   execd installation fails, if local spool dir is not entered by user!
1385     6203984   Port free/used check returns a wrong result in some cases!
1256     6205060   SGE tools segfault when gid can't be looked up
-        6205729   Wrong constraint about spooling directory location in Install Guide
-        6208982   database model of reporting database missing in admin guide
1332     6209487   install guide: qmon under JDS needs correct Motif runtime libraries be installed
1519     6215730   qdel failed to delete qrsh (login) job on a Solaris box when Secure Shell is used
1418     6218379   Problems with BDB RPC server are hard to diagnose
1403     6218430   Problems with load values if execution daemons run in a solaris zone at x86
1420     6218877   qstat -t is broken
1422     6219517   qsub -sync y doesn't remove session directories
103      6219999   changing of local execd_spool_dir is fault prone
-        6220019   Administrator guide lacks documentation about certificate renewal
1427     6220060   wrong calendar settings kills the qmaster
1416     6221167   sge_schedd segfaults in case of a restart and a running pe job.
1433     6221231   qsub -sync y return code behaviour broken
1434     6221244   releasing user hold state through qrls may not require manager privileges
1424     6221850   Request for start-up script additions
1473     6222237   huge CPU and memory overhead when modifying complex attributes
1438     6222811   scheduler can get out of sync
1431     6222861   error message "no execd known on host"
1533     6222930   After shadowd takes over there is a long delay before execd connects to new qmaster
1449     6225570   sharetree has a usage leak
1436     6226085   suspend_interval is ignored when enabling jobs due to suspend_thresholds change
-        6228350   Execd messages file contains incorrectly-formatted lines
1461     6228786   Long delay when starting up large pe jobs
1441     6229253   a parallel array job can kill the qmaster
1505     6229277   qselect uses sge_qstat file
1463     6229373   An array pe job can set queues into error state
1501     6229603   reprioritize parameter is NOT documented
1465     6230846   execd logs error message, when a tight pe job in "t" state is deleted
1458     6231366   deadlock in the qmaster due to qconf -k[s|e]
-        6231376   N1GE Users Guide does not mention possibilities due to -b {y|n} option
1208     6231589   execd uninstall doesn't remove all objects
1454     6232074   load formula is not working for pe jobs
1468     6233162   global scheduler messages are reported multiple times
-        6233173   qloadsensor dies sporadically
1494     6233300   Upgrade procedure should be more verbose wrt manual steps required to transfer 5.3 configuration
1504     6234371   error message from execd about endpoint is not unique
1453     6234836   Need a means to purge host or hostgroup specific cluster queue
1492     6235845   install script should create execd spooldir
1244     6236136   backup/restore for classic and rpc server spooling not supported!
1242     6236139   restore procedure does not really ensure qmaster is down
-        6236261   BDB install on NFSv4 share
1076     6236469   JAPI: Can be made to start two event client threads
1422     6236472   qsub -sync y doesn't remove session directories
1470     6236475   DRMAA segfaults with > 255 threads
1472     6236476   NoClassDefFoundError: org/ggf/drmaa/NoResourceUsageDataException
1425     6239394   Spooledit fails during database upgrade
-        6239461   load values adjustment on Windows execution host
-        6239465   man can't display man pages
-        6239470   Avoid that sge_execd has to be started by the Domain Administrator
-        6239479   Improve installation documentation of Windows execution host
-        6239492   Installation must stop when system is not set up right
1243     6239504   adminuser is not considered in autoinstall!
1502     6239569   qmaster does not accept new connections if number of execd's exceed FD_SETSIZE
1478     6239640   ./inst_sge -x fails with fqdn and no default domain
1356     6239655   inst_sge only deletes common, but not
1486     6239660   qmaster profiling doesn't start at qmaster startup
1439     6240739   qstat -s hu shows pending jobs only
1469     6241376   qstat -U aborts
1484     6241378   Reservation of wrong hosts
1462     6241401   Conflicting requirements should have the same meaning with qstat and qsub
1431     6241430   error message "no execd known on host"
1489     6241487   termination script may not be ignored, when job submitted with -notify
1508     6241544   qstat -F dies in case of a infinite integer setting
1379     6242055   Consumable request may not be 0 if PE requested
1447     6242057   jobs which request consumable resources which are set to infinity are not scheduled
1471     6242165   Profiling library never frees thread slots
1512     6242172   Multi-threaded args parsing problems
1479     6242181   Failed drmaa_control (DRMAA_CONTROL_TERMINATE) causes deadlock
1362     6242779   qsub -now yes not working on CSP system
1365     6244215   qsub -b y must fail if no command is specified
1435     6244229   misleading qstat -j message when the scheduler is not running
1518     6244808   scheduler does not get all objects on a qmaster or scheduler startup
1520     6244865   a series of matching soft queue requests gets not counted separately
1395     6245486   sge_ca needs to export SGE_CELL
1524     6245487   qhost -h does not show selected host
-        6246180   An ARCO installation example leads to a failure on a certain operating systems
1525     6247211   qstat -explain E does not print queue errors correctly
1529     6247238   qsub fails to work correctly with -b n -cwd
1450     6247239   sequence nr of execd load reports corrupted
1433     6247889   qsub -sync y return code behaviour broken
-        6249252   Error in User Guide Table on qacct -j failed codes on p.122
-        6250186   ARCo documentation should explain where is the file config.xml
1543     6251172   reserved jobs prevent other jobs from starting
1545     6251175   berkeleydb server shutdown script fails
1540     6251178   install_qmaster picks up commented out service sge_qmaster
-        6251943   japi does not work with host aliasing
1551     6252465   qsub option parameter string only supports 2048 character strings
1548     6252522   qconf -purge queue hostlist all.q@host segfaults
1549     6252524   Missing success message with qconf -Aprj
1552     6253093   qstat -f -pe make breaks
1565     6253138   auto_inst uses ADMIN_HOST_LIST variable only at qmaster installation time
1575     6253192   bdb rpc auto install does not work
1560     6253219   BDB RPC server with NFS spooling dir and master auto_install does not work
1554     6253266   failed array tasks are rescheduled only one by one
1573     6253278   auto_inst should not be case sensitive for hostnames
1574     6253291   auto_inst uninstallation with fqdn does not work
1559     6253313   auto_inst -um does not use configuration file
-        6254840   Install failure for execution hosts on multiple domains
1562     6255329   qmaster does not store sharetree usage on shutdown
1563     6255336   execd does sends empty job report for a pe slave task
1566     6255804   job in error state breaks qstat -f -xml
1567     6255850   the usage in projects is never spooled while the qmaster
1568     6255902   qmake in dynamic allocation mode core dump
1430     6256457   pe jobs disappear in t state (execd doesn't know this job)
1572     6256530   cqueues/all.q trashed after qmaster shutdown with 1362 hosts
1576     6257389   inst_sge -bup with rpc server destroy database
1579     6259380   potential qmaster sec. fault.
1585     6259993   inst_sge -bup does not backup shadow_masters file
1582     6260024   qmon cluster queue modify cancel not working correct
-        6260704   qsub -sync is not interruptable once the job has been scheduled
1586     6260729   Can't select 'slots' in select box when adding consumables for execution host
1354     -         install CSP problems on AIX43
1450     -         sequence nr of execd load reports corrupted
1442     -         Arguments to binaries sent to qsub are given to invoking shell too
538      -         PATH size limit of 2048 characters
1405     -         DRMAA Java language binding does not work in binaries
1404     -         Clonable classes should implement Cloneable

Bugs fixed in SGE 6.0u3 since release 6.0u2
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
1389     6205648   error in commlib read/write timeout handling
1401     6211243   The qstat -ext -xml command is broken with N1GE6 Update 2 patch
1400     6211309   qmaster running out of file descriptors
1392     6211725   uninstall of exec host doesn't work
1413     6215580   execd messages file contains errors for tight integrated jobs
1414     6216020   pending job task deletion may not work

Bugs fixed in SGE 6.0u2 since release 6.0u1
-------------------------------------------
Issue    Sun BugId Description
-------- --------  ------------------------------------------------------------------------------------------
790      5063315   Confusing Install Text: spooling method
791      5063317   Confusing Install Text: port numbers
1132     5071878   no man page for qping and gethostname binaries
1283     5075968   Thread enabled commlib coredumps on exit on a 32bit Solaris x86 box
1221     5085010   qmon customize filter for running jobs does not filter
1287     5086108   wrong message appears when queue instance becomes error state
1216     5089222   scheduling weirdness with wild-card PE's
1234     5089255   Submit to a queue domain is never scheduled
1313     5090162   qmake does not export shell env. vars
1253     5092487   hard resource requests ignored in parallel jobs
1224     5094016   o-tickets assigned to departments are ignored
1261     5095907   qacct -l is not working
1275     5097732   Need detailed error messages from communication layer
1274     5102320   memory leak in the scheduler, with pe jobs and resource requests
1270     5102340   drmaa_synchronize() waits for all jobs, including newly submitted jobs
1269     5102442   qconf -de crashes qmaster
1235     5104270   Cannot add calendar with \ syntax
1277     5104789   mail sent by qmaster leaves zombie processes
1176     5108635   $ARCH required in path for qloadsensor and qidle.
1251     5108639   qconf -sstree seg faults with large share trees
1304     6174301   N1GE6: qsub -js and negative job_share numbers acts strangely/unexpectedly.
1198     6174326   qconf -sq displays "slots" in the complex_values line
1255     6174331   Option "-v VAR" does not fetch from environment
1286     6174821   segmentation fault when vmemsize limit is reached
1295     6174915   qconf has wrong exit status
1294     6176115   Show qmaster/execd application status in qping
1239     6176177   restoring a backup does not restore the job_scripts dir.
1291     6176181   qdel "" kills qmaster
-        6178328   Admin/Users Guide: qstat has been enhanced.
1299     6180529   meaningless job error state diagnosis text in qstat -j
1251     6183365   qconf -sstree gives a SIGBUS error
1308     6184460   qmod -[d|e] cannot handle the following qnames: "[0-9]*"
978      6184466   scheduler does not look ahead to consider queue calendars state transitions
1307     6185136   Job customize shows weird characters for fields, additional fields cannot be added
1267     6185169   qmon returns an error dialog, when editing a calendar
1302     6185208   qmon and equal job arguments
1300     6185211   Job environments should not include Grid Engine dynamic library path
1315     6189286   memory leak in the scheduler with consumables as load thresholds
1316     6189289   a cluster queue can be deleted, even though it is referenced in another cq
1279     6190164   too many array tasks are deleted
-        6191366   tightly integrated pe jobs: scheduler doesn't respect usage of pe tasks in sharetree calculation
1324     6193348   qconf -mq does not output the subordinate_list correctly
1323     6193361   Jobs fail in case of NFS execd installation on volumes exported without root write privileges
1328     6193866   backup/restore does not work under Linux and others..
1329     6194002   sgemaster -migrate on qmaster host tries to start second qmaster
1289     6194625   subordinate queues consume excessive memory
1335     6194713   Only first subordinate queue will be suspended at qmaster restart
1336     6194729   Subordinate queue thresholds are not spooled with BDB
-        6195249   QMON Cluster Queue Window: Heading line words does not match into column width
1344     6196578   backup fails, when...
1345     6197253   DRMAA_DURATION_{H|S}LIMIT misspelled as "durartion"
1360     6199261   a sharetree delete can kill qmon
1357     6200013   util/arch script OS matching problems for Linux x86 and amd64
1280     6201033   qmaster might fail if jobs are deleted which have multiple hold states applied
-        6201038   reduce the impact of qstat on the overall performance
1319     6201039   qconf -ks gives bad error message if scheduler isn't running
1317     6201040   Exit 99 jobs are not rescheduled to hosts where they ran before
1030     6201042   qdel "*" produces error logging in qmaster messages file

Bugs fixed in SGE 6.0u1 since release 6.0
-----------------------------------------

Issue    Sun BugId   Description
-------- ----------- ------------------------------------------------------------------------------------------
1090     5062683     Install script fails when sgeadmin is selected as install user.
1082     5063305     remove stat_log_time
1087     5063311     high memory usage of schedd and qmaster (schedd_job_info)
1091     5063316     PE job submit error, when qmaster is busy
1098     5063987     qmaster cannot bind port below 1024 on Linux
1122     5071498     projects not available after sge_qmaster restart
1111     5071502     calendars broken
1110     5071522     Startup of qmaster changes act_qmaster to `hostname`
1109     5071525     qalter abort
1124     5071539     qping doesn't support host_aliases file
1130     5071868     uninstall procedure doesn't remove the rc-script of execd!
1133     5071914     scheduler ignores queue seqno for queue sorting
1131     5071918     qmod -e '@' causes segmentation fault in qmaster
1104     5071987     Qmaster requires a local conf in order to start.
1135     5071999     inst_sge -sm doesn't create a local_conf
1150     5072005     drmaa_run_job() may change the current directory
1117     5072481     Deleted pending job appears in qstat
1129     5072772     sge_qmaster constantly rewrites spool files of tightly integrated parallel jobs
1146     5073218     qconf -aq @ crashes qmaster
1154     5074788     jobs on hold due to -a time cause qmaster/schedd to get out of sync
1094     5075346     Sharetree doesn't work correctly
1118     5075398     variable syntax : equal sign support
1139     5075451     sched_conf(5) reprioritize_interval should default to 0
1099     5075849     a registering event client can get events before it got its total update
         5075936     qmon's queue filtering doesn't work
         5076358     It should be possible to use "." and "$" with qsub -N
         5076372     "|" should be able to be used with qsub -N
1126     5076491     qmaster clients may not reconnect after qmaster outage
1140     5077165     reprioritize_interval descr in sched_conf(5) needs improvement
1097     5077167     NO_REPRIORITATION should be removed from man pages
1146     5077549     qsub -N "@" causes qmaster down
1141     5077589     schedd and qmaster get out of sync - no scheduling for long time
1162     5078783     Wallclock time limit in qmon
1113     5079514     execd shutdown with sgeexecd fails when host aliases are used
1178     5079572     Resending queue signals broken
1183     5080779     qconf -de host does not update the host groups
1168     5080784     qselect crash
1081     5080833     qconf -mattr dumps core if used incorrectly
         5080836     qhosts outputs NCPU as float
1092     5080839     qconf -mq displays "slots" in the complex_values line
1172     5080840     problems when qconf -mattr is used in conjunction with host_aliases file
1109     5080851     qalter/qdel/qmod abort
1146     5080852     qconf -aq @ crashes qmaster
1151     5080853     DRMAA doesn't reject jobs that never will be dispatchable
1161     5080856     QCONF: qconf -mc segfaults
1191     5081821     qstat XML output typo
1175     5081822     Deleting a queue instance slots value actually adds it
1186     5081839     qconf -ahgrp fails if no hgrp name is specified
1192     5082490     qstat -ext -urg omits time info
1185     5083102     hostgroup changes do not always take effect.
1101     5083115     Need more verbose diagnosis msg if execd port is already bound
1207     5084317     Invalid job_id's in reporting file (only l24_amd64)
1214     5084927     install script fails without SGE_ROOT
1144     5085392     qstat -j -xml generates no parseable xml output
1219     5085507     sge_inst restore does not work
1222     5085508     sge backup cannot override older backups
1233     5087268     Parsing of @f* hostgroups plus some minor issues
1170                 DRMAA attributes table is garbled
1112                 Wrong entry SGE_QMASTER_PORT in qconf man page
1181                 qstat should indicate cluster queue load is different from queue instance load
1190                 qstat(1) must explain cluster queue load concept
97                   exit code implications for various methods
1103                 SGE 6.0 manpage indicates qsub accepts "-r yes|no"; only "y|n" accepted
1083                 remove stat_log_time from configuration dialogue