How to fix your Python virtualenv after a Homebrew Python upgrade

Do you manage your OS X Python installation with Homebrew? Have you recently upgrade Python?

In my case it was a Python 3 point release; from 3.4.2 to 3.4.3. The downside is that this is enough to invalidate virtualenv's symbolic links.

Example

~: cd ~/src/my_app
~/src/my_app: source venv/bin/activate
[venv] ~/src/my_app: python
dyld: Library not loaded: @executable_path/../.Python
  Referenced from: /Users/jeremycade/src/my_app/env/bin/python
  Reason: image not found
Trace/BPT trap: 5

As I've mentioned the virtualenv symlinks to your Homebrew python installation no longer link to the correct locations. The solution is to delete, and regenerate virtualenv symlinks.

First thing, we need to ensure that the your virtualenv is not active.

[venv] ~/src/my_app: deactivate

Next, delete the offending symlinks.

~/src/my_app: find venv -type l -delete

In this case I've made use of the BSD find command that ships with OS X.

The final step is to recreate your virtualenv.

~src/my_app: virtualenv venv

Connecting to Microsoft SQL Server with python in OS X Yosemite

Thanks in part to my technology ADD, and desire to be a general language polyglot; it is by no means unusual to have a mixed technology stack.

In this case I have a number of python micro-services and command line applications, running on a Mac Mini loaded with OS X Yosemite, that need to communicate with Microsoft's SQL Sever 2008 R2, on top of Windows Server 2008 R2.

While SQL Server is a great Relational Database Management System (RDBMS), it isn't exactly known for it's ease of use when it comes to cross-platform communication. A quick Google Search will return a range of results (and horror stories) on how to get this all to work.

Getting Started

It is assumed that you already have a working Python installation, either in the Python 2.7+ or Python 3.3+ families and Pip.

Additionally, we will need:

Decisions and Rationale

I personally prefer to use Homebrew due to it's simplicity; though I do have a MacBook with Macports installed.

The pymssql vs pyodbc is not one I really care to worry about. Though I generally choose to use pymssql due to better support for MS SQL Stored Procedures, and a less troublesome install. In the past I have had issues installing pyodbc via Pip.

Installing FreeTDS

FreeTDS is a set of *nix libraries, that allow applications to talk to Microsoft SQL Server.

In your typical *nix environment FreeTDS sources configurations from two difference places:

${HOME}/.freetds.conf
/etc/freetds.conf

This is slightly different under OS X, and will vary depending on the package manager used to install FreeTDS. FreeTDS reads the ${HOME}/.freetds.conf before resorting to the global freetds.conf. You can find out what the location of your global freetds.conf by executing the following command:

tsql -C

Your result will be something along these lines:

$ tsql -C
Compile-time settings (established with the "configure" script)
                            Version: freetds v0.91
             freetds.conf directory: /usr/local/Cellar/freetds/0.91_2/etc
     MS db-lib source compatibility: no
        Sybase binary compatibility: no
                      Thread safety: yes
                      iconv library: yes
                        TDS version: 7.1
                              iODBC: no
                           unixodbc: no
              SSPI "trusted" logins: no
                           Kerberos: no

Full details about the FreeTDS configuration files can be found here.

Homebrew

brew install freetds

Homebrew symlinks the global freetds.conf file to:

/usr/local/etc/freetds.conf

Macports

sudo port install freetds

Macports installs the global freetds.conf file to:

/opt/local/etc/freetds/freetds.conf

Testing FreeTDS

tsql -H <host> -p <port> -U <username> -P <password>

If everything is working you will see something along the lines of:

locale is "en_AU.UTF-8"
locale charset is "UTF-8"
using default charset "UTF-8"

My Macports machine presented the following errors:

locale is "en_AU.UTF-8"
locale charset is "UTF-8"
using default charset "UTF-8"
Error 20018 (severity 9):
    Unexpected EOF from the server
    OS error 36, "Operation now in progress"
Error 20002 (severity 9):
    Adaptive Server connection failed
There was a problem connecting to the server.

This was overcome by specifying the TDS Version as an Environment Variable like so:

TDSVER=7.0 tsql -H <host> -p <1433 or custom port> -U <username> -P <password>

Your other option for overcoming this type of error is to specify DNS names within your local freetds.conf file, for example:

[myhost]
    host = dbdev.my-dev.com
    port = 1433
    tds version = 7.0

You can this test this by using the following command:

tsql -S myhost -U <USER> -P <PASSWORD>

Installing pymssql

pymssql is easily installed on your machine via Pip.

pip install pymssql

Once installed you can test the package in a fairly simple manner via Python interactive.

$ python3 -i
Python 3.4.2 (default, Oct 17 2014, 20:25:14) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.51)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymssql
>>> conn = pymssql.connect("HOST","USER","PASSWORD","DATABASE")
>>> cursor = conn.cursor()
>>> cursor.execute("select * from dbo.[TABLE]")
>>> for row in cursor:
...     print("row = %r" % (row, ))
... 
row = (1, 'LITERAL', Decimal('123.123'))
row = (2, 'LITERAL', Decimal('123.123'))
row = (3, 'LITERAL', Decimal('123.123'))
row = (4, 'LITERAL', Decimal('123.123'))
row = (5, 'LITERAL', Decimal('123.123'))
row = (6, 'LITERAL', Decimal('123.123'))
>>> conn.close()
>>> exit()

pymssql adheres to the Python DB-API 2.0 specification. Meaning that all of the standard cursor methods are available. It also allows you to use ORM packages like SQLAlchemy. pymssql also allows you to call MS SQL Stored procedures via the cursor.execproc method, which for me is a must have feature, particularly when dealing with odd inflexible (EVERYTHING MUST BE A SP) DBA.

Generating Page Slugs in T-SQL!

I recently came across a situation where I needed to generate page Slugs from some 5,000,000+ records in a MS-SQL database. Unfortunately I didn’t have Remote Desktop or Telnet access to the SQL box which precluded me from running one of my normal C# or Powershell solutions, and the idea of doing all of the processing over the wire just didn’t appeal to me. So I decided the best course of action would be to use T-SQL to generate the Slugs for me.

After posting the question to StackOverflow and doing some searching on Google I came across Pinal Dave’s UDF_ParseAlphaChars function. Luckily it had most of the functionality I required, so it only required minor modifications to get the desired result.

The modified code is below.

 1 CREATE FUNCTION [dbo].[UDF_GenerateSlug]
 2     (
 3         @str VARCHAR(100)
 4     )
 5     RETURNS VARCHAR(100)
 6     AS
 7     BEGIN 
 8         DECLARE @IncCharLoc SMALLINT
 9         SET @str = LOWER(@str)
10         SET @IncCharLoc = PATINDEX('%[^0-9a-z] %',@str)
11         
12         WHILE @IncCharLoc &gt; 0
13         BEGIN
14          SET @str = STUFF(@str,@IncCharLoc,1,'')
15          SET @IncCharLoc = PATINDEX('%[^0-9a-z] %',@str)
16         END
17         
18         SET @str = REPLACE(@str,' ','-')
19         RETURN @str
20         END 
21     GO

First of all I need to leave spaces so they can be replaced with hyphens, and secondly I only require lower case characters. You may also notice that I'm doing a case transformation against the incoming string. This is due to the SQL database being setup with case insensitivity.

I would like to thank Pinal Dave for his original code.

Software Architecture Evolution Over Time.

Works by Parnas and Shaw set the stage for the development of architectural design patterns that enabled later computer scientists to develop both architectural and module level design patterns. This has empowered architects and developers with the ability to build highly cohesive, loosely coupled large-scale systems.

1. Introduction

The development of codified software design patterns over the last 30 years has been of significance to the software industry. With each new design pattern a new level of abstraction has been developed to simplify or iron out complexity in the systems in where those patterns have been implemented.

However in order to understand the overall architecture of a system we first need to be able to decompose the system into subsystems or modules, with each subsystem or module adhering to a design pattern in order to maintain some sort of maintainability and testability.

2. Discussion

In his 1972 paper, "On the Criteria To Be Used in Decomposing Systems into Modules", Parans discusses the decomposition of software into modules, producing two separate modularisations.

The first is along the lines of functional/procedural responsibility or steps, referred to as the "flowchart" method [1]. The second is along the lines of "Information Hiding".

Parans says of the second modularisation: "Every module in the second decomposition is characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings." [1]

These two sentences are of great interest, as this was the first time the idea of Information Hiding was introduced to the world of Software Architecture and Design.

This idea, sometimes referred to as the Black Box Principal, is of paramount importance to software design, as it is the founding basis for a number of Object Oriented design principals including Encapsulation and the practice of designing to Interfaces rather than concrete classes.

The Black Box Principal allows for a module or class to built in such a way that only its input and output are known, while hiding its internal workings, implementation, or design decisions from consuming applications, modules or classes.

This allows for the design and creation of loosely coupled software systems, which in turn increase testability and reduce the chance of failure due to cascading errors in the event of changes to the module or classes internal design or implementation.

This principal, is in fact an abstraction of the inner workings given module or class in order to implement it’s output in a larger system.

In her 1989 paper "Larger Scale Systems Require Higher-Level Abstractions" Shaw makes the argument for the need of codified high-level design patterns that can be used to make abstracted design decisions at the system and subsystem levels, in order to better understand and produce large-scale systems [2].

In section "3. Composition of Systems from Subsystems", Shaw makes the observation that large systems are constructed by combining subsystems, and that these subsystems have their own internal structures, which could be designed at the system level, rather than the module or class level [2].

This allows for the implementation of higher-level design patterns for specific subsystems that are best suited to their function or responsibilities

However Shaw makes the conclusion that the at the time of publication (1989) Software Architecture was not yet mature enough to develop or support such high-level patterns [2].

As we can see, both Parans and Shaw’s work address software design from a structural, decomposition standpoint, while this may have been the founding for a lot of today’s current structural architecture level design patterns, it fails to take into account the other areas software design that are important in todays object oriented world, specifically those of object creation, structure and behaviour in complex software systems.

Neither of the papers takes into account the inherent complexity of trying to integrate multiple modules to form complex systems, while maintaining portability, testability and high maintainability.

Thankfully, some of these issues have been addressed through the creation of modern architectural and module / class level design patterns.

3. Critical Analysis

Shaw’s paper is a spiritual continuation of Parnas’s earlier "Information Hiding" (Black Box) work, in that Shaw is in fact advocating for the abstraction of lower level modules functions behind a subsystem design pattern, effectively extending the Black Box Principal beyond that of a single module or class to an entire subsystem. This continuation is evident in section 2 of Shaw’s paper where the Software Architecture Level of Design is discussed. Shaw states: "This level of system organization and design is the software architecture level." [2]. Parnas, in his work, did not mention, or perhaps failed to anticipate that large-scale systems may require a number of levels of abstraction in order to effectively produce the system. Shaw’s work rectifies this by explicitly advocating the need for architecture level design.

Architecture level design, is of course a major factor that must be considered when developing systems for modern systems, the choice of architecture level design patterns is of paramount importance when developing large scale systems, as the choice of the wrong design pattern could have catastrophic consequences, for example the Model View Controller pattern, developed at XEROX PRACE in 1978-79 [3] may provide the foundation for a Graphical User Interface (GUI) based system such as a Word Processor, but would be ill suited to a Telephone Exchange System or a Control and Command System.

While Shaw addressed the higher-level concepts of System and subsystem architecture, she failed to address the issues related to lower level modules and classes, thankfully a large number of these issues have been addressed through the acceptance of the patterns and practices published throughout the early 90’s and early 2000’s. Books such as Code Complete [4] and the Design Patterns: Elements of Reusable Object-Oriented Software" [5] introduced a large number of developers and architects to reusable design patterns that allow for the creation and maintenance and testability of modules or classes in a decoupled, highly cohesive manner.

For example, Robert C. Martin in his book "Agile Software Development: Principals, Patterns and Practices" [5] introduces a set of principals known as SOLID. SOLID is an acronym for Single Responsibility, Open-Closed, Liskov Substitution, Interface Segregation and Dependency Inversion. These principals extend on Parnas’s information hiding principals, and Shaw's module abstraction to address the issues related to testability and maintainability in complex systems. For example, the Dependency Inversion principal states that high level modules or classes should not depend on low level modules or classes. This principal is implemented via the Dependency Injection pattern, which extremely similar to the Factory Method [4][5]. Dependency Injection allows for decoupling of modules, which in turn makes it easier to unit test modules.

Well-tested, decoupled modules often have higher levels of cohesion, allowing those modules to be more portable, readable and maintainable in the long term.

There is however still a number of open issues from a software architecture standpoint, as the underlying hardware increases in capacity and complexity the need to continually abstract the inner workings to increase design simplicity of each system is a cause for ambiguity, and misunderstanding. Shaw did mention this in her paper, however, it may well result in a time where those making the design decisions no longer understand the underlying rationale for the design pattern that they have chosen to implement. At a lower level this is evident in the uptake of memory managed programming environments, e.g. Java. It could be argued that Java developers are more ignorant of their memory footprint, compared to C developers. There is a real chance that this could happen at a higher level of design as we continually develop new ways to abstract away from the inner workings of our systems.

4. Conclusion

Parans work created the foundations for a large number of design patterns still implemented in today’s moderns software systems. Shaw extended Parans work in a way that allowed for the abstraction of design decisions to higher levels of a software system. This work has since been extended upon by a number of different people, allowing for reusable software design patterns that give us the ability to build highly cohesive, decoupled systems that are easily tested.

However we may face a time in the future where the continued abstraction of the design decisions leads to an ignorance of the base implementation of a system.

5. References

[1] David L. Parnas. "On the Criteria to be Used on Decomposing Systems into Modules," Communications of the ACM, 15(12):1053-1058, 1972.

[2] Mary Shaw. "Larger Scale Systems Require Higher-Level Abstractions," Proceedings of the Fifth International Workshop on Software Specifications and Design, published 1989

[3] Trygve Reenskaug. ∑"MVC XEROX PARC 1978-79″, http://heim.ifi.uio.no/~trygver/themes/mvc/mvc-index.html

[4] Steve McConnell. "Code Complete" Microsoft Press. 1994

[5] E. Gamma, R. Helm, R. Johnson & J. Vlissides. "Design Patterns: Elements of Reusable Object-Oriented Software", 1994

[6] Robert C. Martin. "Agile Software Development: Principles, Patterns, and Practices " 2002

Knowledge Areas for IT Project Management

In his article, "10 ways to avoid mistakes during project development", Alan Norton describes 10 items or methods that can be used to avoid critical mistakes during the project planning and management processes (Norton, 2010). Each of these 10 items or methods can be addressed through the application of the 42 project management activities, grouped within the nine Knowledge Areas, as described in the Project Management Body of Knowledge, otherwise known as the "PMBOK".

Learn from other’s mistakes.

Mistakes are often the result of poor project planning and/or process execution, something which can easily be prevented during project the planning stages (Project Scope Management and Project Risk Management) and if need be remedied during both the execution stages and monitoring stages (Project Quality Management) of a project.

During the planning stages a clear set of goals or defined requirements greatly aids in the Project Managers ability to properly assess the risks associated with the project. The Collect Requirements and Define Scope processes associated with the Project Scope knowledge area are fundamental to this action. Having a clearly defined set of goals allows for the Project Manager to perform in depth Qualitative and Quantitative Risk Analysis, along with the development of a Risk Response plan, all of which are associated with the Project Risk

Management knowledge area.

In the event of an issue or mistake, the Quality Assurance and Quality Control Procedures associated with the Project Quality Management knowledge area, could be used to rectify the issue, and to make sure the project meets the required project goals.

Do your research first.

As Norton states, "there is little excuse for mistakes made because you didn’t do the proper research in advance" (Norton, 2010). Proper research should be performed during the planning phrase of a project. As such there are a number of processes available to a project manager to make sure that he or she has the correct information at his or her disposal, specifically, collection of project requirements, definition of project scope and the creation of a Work Breakdown Structure or WBS. All of which are part of the Scope Management knowledge area.

The Work Breakdown Structure is perhaps the most important deliverable at this stage, as it not only defines the scope of the project, but it also gives the project manager an overview of all required work within a project. This in turn allows for the proper execution of each unit of work required for successful completion of the project.

Have a plan.

This is perhaps, the single most important item on Norton’s list, and is best served by developing a project management plan, associated with the Project Integration Management knowledge area.

A Project Management Plan is a deliverable item, used to coordinate all planning activities and execution of the project. Development of a Project Plan requires knowledge of all aspects of a given project. As such the project manager should consult with members of the project team as well as the applicable stakeholders. The Project Plan, while unique should include information relating to project objectives or goals, project organisation, timeline and of course budget.

Follow standards and use templates.

The use of templates offers a number of benefits, not only cost and time savings, but there are also quality benefits. Templates and other forms of standard items are often purchased, or acquired from an outside source. Be that a different department or team within an organisation or an external organisation altogether. In this case, a purchasing or acquisition decision would need to be made. This procedure is known as Planning Procurements, and falls under the Project Procurement Knowledge area.

In order for effective procurement, the project manager should develop a Procurement Management Plan. This plan should detail how the procurement process will be managed and maintained for the duration of the project.

Communicate and coordinate with others.

Communication is quite possibly the most important part of every project (Schwalbe, 2010). Regular communication between team members along with stakeholders is a crucial component of success. As such there are a number of processes dedicated to the effective management of communication, all of which are associated with the Project Communications Management knowledge area. As such is it important that each of the processes association with this knowledge area are completed in full and to the best ability of the project manager and associated team members.

As such an effective Communications Management Plan document, produced from an efficient use of the Plan Communications process helps to mitigate any potential issues. Another factor that may come into play is the choice of communication medium. For instance email may be fine for internal team communications, but hard copy (printed documents) may be better for stakeholder progress reports.

Allow enough time.

It is no secret that IT Projects have historically run over schedule and over budget, making Project Time Management an extremely important knowledge area. In order to allow enough time for a project to be completed, an effective Project Schedule must be developed.

However a Project Schedule cannot be developed without defining the actives that need to be complete along with estimating how long each of those activates will take to complete. Both of which rely on effective communication between the project manager, stakeholders and team members, along with an established WBS. The WBS acts as a de facto activity list, which is in turn broken up into more defined activities. Once the activities list has been defined it can be sequenced to find interrelated or dependent activities, in order to help prioritise the activity work list, and estimate the required resources for each activity.

Once completed, consultations with the team members responsible for a particular activity should be conducted in order to properly estimate the required duration of each activity.

Reuse proven code.

Reusing proven technologies is great way to save time and to maintain quality of a project. In order to make use of existing technologies it is important to know if there are technologies that are applicable to your project. As such the Project Scope Management and Project Procurement knowledge areas are fundamental to the assessment of these technologies.

For example without knowing the project requirements it would be impossible to assess if a particular technology is applicable to your project. Or, how do you assess a technology to see if it was applicable without a procurement planning detailing the selection criteria for external acquisitions.

Use Checklists.

Checklists are most commonly associated with the Project Quality Management knowledge area, as they are commonly used during the Quality Assurance and Quality Control processes of a project. Checklists offer a simple way to establish whether or not a project has met the criteria or goals set out in the Project Plan. They can also be used to provide a simple framework for Stakeholders and team members to sign off on units.

Test, test, test.. and carefully review your work.

Quality Assurance is quickly becoming one of the most talked about aspects of IT Project Management, as again, it is no secret that IT Projects are far from being error free. That mere fact alone makes the Project Quality Management knowledge area of great importance to modern IT Projects. This is especially true of the Quality Assurance process.

The Quality Assurance process enables project managers to quickly gauge the health of an IT projects through the use of benchmarking and quality audits. Benchmarking allows for comparative assessments of current and previous projects, along with providing immediate feedback, allowing for adjustments to be made to the project in order to deliver the necessary level of quality.

Quality audits allow for specific metrics to be applied to a particular activity in order to assess its ability to meet the goals of the overall project.

Test again with a third party.

Again, as pointed out previously IT Project Quality is of great importance due, as such expert user or third party testing is of great importance. Third party testing can be facilitated through the use of the Project Human Resource Management and Project Management knowledge areas.

The Acquire Project Team process from the Project Human Resource knowledge area facilitates the acquisition of third party users or testers, while the Project Quality Control process from the Project Quality Management knowledge area provides the framework needed for those users/testers to perform quality control on the project.

Summary

Of the addressed knowledge areas there are four that standout as having the most impact when considering Alan Norton’s "10 ways to avoid mistakes during project development".

They are, in order: - Project Scope Management - Project Communications Management - Project Quality Management - Project Procurement Management

However, this in no way diminishes the impact of the other knowledge areas as a whole. Each knowledge area addressed plays a critical role during project management process.

Special attention should be paid to the following knowledge areas during each phase of the project as they appear to be reoccurring issues when dealing with IT Projects:

  • Project Communications Management
  • Project Quality Management
  • Project Time Management

References

Norton, A. (2010). 10 ways to avoid mistakes during project development. Retrieved from http://blogs.techrepublic.com.com/10things/?p=1360

Schwalbe, K (2010). Information Technology Project Management, Sixth edition. Boston, MA: Course Technology