2600hz Crosstalk

CouchDB/BigCouch Bulk Insert/Update

This post is cross-posted from James Aimonetti’s personal blog. 


While writing a bulk importer for Crossbar, I took a look at squeezing some performance out of BigCouch for the actual inserting of documents into the database. My first time running all the documents into BigCouch at the same time resulted in some poor performance, so I went digging around for some ideas on how to improve the insertions. Reading up on the High Performance Guide for CouchDB (which BigCouch is API-compliant with), I started to play with chunking my inserts up to get better overall execution time.

Note: The following are very unscientific results but I think they’re fairly instructive for what one might expect.

[Image: couch post spreadsheet]

Based on the CouchDB guide, I decided to not pursue this further, as dropping insertion time two orders of magnitude was fine enough for me! I may have to bake this into the platform natively.

For those interested in the Erlang code, it is pretty simple. Given a list of documents to save, use lists:split/2 to try to split the list at the chunk size. lists:split/2 raises an error when the list is shorter than the requested length, so by catching that error we know the list is below our threshold and can save the remaining documents to BigCouch. Otherwise, lists:split/2 chunks our list into one part for saving and one for recursing back into the function. Since we don’t really care about the results of couch_mgr:save_docs/2, we could put the calls in the second clause of the case in a spawn to speed this up (relative to the calling process).

Code
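A minimal sketch of the chunking approach described above might look like the following. The module name bulk_chunk, the function save_in_chunks/3, and the SaveFun parameter are assumptions made for illustration; SaveFun stands in for a call such as couch_mgr:save_docs/2 so that the sketch runs without a database.

```erlang
-module(bulk_chunk).
-export([save_in_chunks/3]).

%% Save Docs in batches of at most ChunkSize documents.
%% SaveFun is a hypothetical stand-in for the real save call (for example,
%% a fun wrapping couch_mgr:save_docs/2); its return value is ignored.
-spec save_in_chunks([term()], pos_integer(), fun(([term()]) -> any())) -> ok.
save_in_chunks([], _ChunkSize, _SaveFun) ->
    ok;
save_in_chunks(Docs, ChunkSize, SaveFun)
  when is_integer(ChunkSize), ChunkSize > 0 ->
    %% lists:split/2 raises badarg when Docs has fewer than ChunkSize elements.
    try lists:split(ChunkSize, Docs) of
        {Chunk, Rest} ->
            %% A full chunk: save it, then recurse on the remainder.
            _ = SaveFun(Chunk),
            save_in_chunks(Rest, ChunkSize, SaveFun)
    catch
        error:badarg ->
            %% Fewer than ChunkSize documents left: save them and stop.
            _ = SaveFun(Docs),
            ok
    end.
```

Calling bulk_chunk:save_in_chunks(lists:seq(1, 5), 2, SaveFun) invokes SaveFun with [1,2], then [3,4], then [5]. Wrapping the SaveFun call in spawn/1 would give the fire-and-forget variant mentioned above, at the cost of losing any error the save returns.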

  • 3 weeks ago
How Whistle Offers the Accessibility Google Doesn’t

(Cross-posted on our main blog) 

If you’ve been paying attention to the tech world this morning, you saw this post by Google engineer Steve Yegge (pictured right) floating around. It was a rant he intended to send only internally that instead got Google+’d (Google Plussed? Still trying to figure that one out) to the general public. While he removed his original post of his own volition, there are many reblogs of it still floating around, earning Google many pats on the back for not tearing them all down.

[Image: Steve Yegge]

His rant begins with his experience at Amazon and working with Jeff Bezos. While Yegge doesn’t seem to care too much for the managerial or developer culture at Amazon, he does tip his hat to the platform that was built (at both a high and technical level) and Bezos’ insistence on everything, internally and externally facing, being a consumable service. Honorable mentions were also made of Apple’s, Facebook’s and Microsoft’s platforms. His intentional omission of Google from the list of titans is quickly followed by his undressing of Google’s platform offerings.

He writes, “We don’t get Platforms, and we don’t get Accessibility. The two are basically the same thing, because platforms solve accessibility. A platform is accessibility.” A comparison of Amazon’s and Google’s Developer pages leaves Yegge red in the face, lambasting Google’s offerings as “childish APIs”. Case in point, he notes, Google+ launched with a single consumable API. “And where are the Maps APIs,” he asks.

From day one, development of Whistle has been in the spirit of Amazon’s platform, not Google’s product. We know that we cannot anticipate or react enough to all of our users’ ideas and requests, which is why we’ve built APIs at every level of our stack and are continually growing the amount of control and level of abstraction those APIs provide the developer.

Accessibility for Web Developers
For the web developer, we’ve created a RESTful endpoint into the stack, providing high-level actions like creating accounts and callflows, configuring devices and voicemail boxes, provisioning servers and phones, and much more. Even more real-time interaction with the underlying switches is in the works. Our GUI, built on top of our Javascript framework Winkstart, is a consumer of those REST APIs. Everything the Winkstart GUI has access to, any other developer can take advantage of as well (no back doors).

Accessibility for App Developers
For those wanting to write telephony applications, we have an AMQP message bus, upon which we have APIs for interacting with the switches in real time. Typical dialplan commands like play, say, bridge, and hangup are available to the WhApp developer to control calls throughout the lifetime of the call. All of our WhApps, from registrar to callflows, make use of these APIs. And because AMQP is language-agnostic, developers in any language with an AMQP client can take advantage of them too.


One step further
Unlike Amazon, we take this accessibility one step further. With Amazon, should their servers go down, it takes their APIs down with them and your application becomes unusable as a result. Services built on their APIs are beholden to them to keep their infrastructure usable and responsive (a generally safe bet, but who doesn’t remember the 5-day AWS outage back in April?) Because our components run on the vast majority of hardware/operating systems and the code is freely available (via Github or through our deploy tool), companies are free to build their own clusters, utilize third party hosting services like Rackspace, or use a hybrid where control servers are on cheap virtualized servers, and the media servers that do the heavy lifting are on hosted, physical servers.

Stay in Control
We also diverge from Facebook’s “walled-garden” platform. Facebook owns and uses its users’ inputs (posts, pictures, likes, pages, etc.) to sell data to advertisers, but where are the export tools? Let’s say an appalled user wanted to transfer all their photo albums to another service or to their laptop’s hard drive. Once on Facebook, control of that data is completely lost.

While we offer a hosted service, moving your data from our service to your own hosted platform is as simple as an HTTP request (thank CouchDB replication!) and moving from your own hosted solution to our service is equally painless. So no matter what your business’ needs are, moving the data around is simple and you get to stay in control.

Yegge’s rant draws important attention to building a platform versus a product. And as more “cloud services” arise, the ability for developers to inter-operate with those services via meaningful APIs and use them in unanticipated, original ways is of paramount importance. We believe our platform embodies this sense of openness and flexibility, and we are genuinely excited to see how the community will take the building blocks we’ve provided and apply them in fresh and novel ways. Stay tuned for upcoming posts reviewing some of our current clients and developers and what they’re doing to build the next killer app on top of our platform.

  • 7 months ago
The Data Center War and the Cost of a Phone Call

There is a little-noticed war going on in the data center. The fight used to be for data center pricing and rapid buildout. When everyone was moving to hosted, providers couldn’t find enough data center space.

Now the war has shifted to power.

As the need for processing speed, redundancy, data storage and distribution grows, data center demand has gone up, both in space and power, but servers have become more powerful and compact. The result is a collection of thousands of servers crammed into a space that wasn’t designed for the amount of power they demand. Data centers have been unable to catch up with the power demand, so they’ve begun raising prices or have capped their users altogether.

Companies haven’t stopped needing more data center capabilities, though. Instead, they are finding new and innovative ways to meet this problem, effectively changing the data center industry altogether. The response has been two-fold: hardware manufacturers are building more powerful equipment with lower power requirements, and software developers are coding software to fit in smaller memory spaces and utilize optimized code.

In essence, the shift is from “bigger, faster servers” to “smaller, cheaper, more optimized servers.”

With the move to distributed computing, this is quickly becoming known as the Micro Cloud.

As proof of this shift, Intel entered the war with their power-efficient server microprocessors. Their new micro servers allowed for more power to be distributed among many smaller servers. Unfortunately, microprocessors only accounted for 1/3 of the power consumption in a server. To handle the remaining 2/3 of that power problem, server manufacturer SeaMicro responded with their solution: massively packed servers that supposedly handle the same amount of computing data in a sixth of the space while using a quarter of the power. The same article on Intel notes that “cloud computing is the only area where Intel expects to see significant demand change.”

From: http://www.altica.co.uk/files/images/cloud-cost.jpg

The way we see it, there are two problems here. There’s a power issue as well as a cost issue, but the cost issue has been completely ignored in both these referenced articles. The reason there is an expected “significant demand change” is because the cloud will effectively be the answer to this problem. Take this telecom example (if you couldn’t tell already, we are in the industry after all): 

Currently, you can use FreeSWITCH to handle 1,000 calls on a 1U Dell 1850 with a 550 watt power supply drawing 1.5 amps. You will pay around $2,000 per server, and you can fit 8 in a single data center rack using 12 amps of power. Let’s say you want protection against a server failure, so you keep one as a spare. Your total cost comes out to roughly $16,000 for 7,000 active calls.

And if you ran Whistle? You could handle 400 calls on a 1U Supermicro 5015A-EHF-D52 with a 200 watt power supply drawing 0.3 amps. You will probably pay around $350 per server, and 40 fit in a single data center rack using the same 12 amps of power. You’re already saving money, so you buy two spare servers in case of failure. Your total is $14,000 for 15,200 active calls.

More important, it’s easier to add the $350 servers one at a time as you grow than to make $2,000 purchases of mostly unused capacity. This also helps solve the power problem, as there are fewer servers and processors sitting around only partially used.

Now the hard part for software companies is to shift their efforts to making their software run effectively on a Supermicro and then making it easy to hook 40 of them up to work in unison. That’s the entire focus of our company – utilize bleeding edge virtualized or commodity computing equipment to save power and money while improving management and redundancy for VoIP – all while ensuring quality.

What does this all really mean? Currently, you are paying roughly $2.29 worth of hardware for each active channel you can support. With Whistle, the hardware for running an active channel costs you $0.92. That just saved you ~60% on hardware without even touching the power footprint. And at the rate the industry is changing, it is expected that power and software requirements to run the same amount of calls will only drop further.
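The per-channel figures follow directly from the rack numbers quoted earlier, and the arithmetic can be sketched in a few lines of Erlang. The module and function names here are made up for illustration; the inputs are the server counts, spare counts, prices and call capacities given in the two examples above.

```erlang
-module(rack_cost).
-export([cost_per_channel/4, savings/2]).

%% Hardware dollars per active call for one rack.
%% Servers: servers bought, Spares: servers held idle as spares,
%% Price: dollars per server, Calls: concurrent calls per active server.
cost_per_channel(Servers, Spares, Price, Calls) ->
    (Servers * Price) / ((Servers - Spares) * Calls).

%% Fraction saved when moving from OldCost to NewCost per channel.
savings(OldCost, NewCost) ->
    1 - NewCost / OldCost.
```

With the numbers above, rack_cost:cost_per_channel(8, 1, 2000, 1000) is 16000/7000, or roughly 2.29 dollars, and rack_cost:cost_per_channel(40, 2, 350, 400) is 14000/15200, or roughly 0.92 dollars. Passing those two values to rack_cost:savings/2 gives a little under 0.60, which is where the ~60% comes from.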

All of this is driving the demand to rethink the data center and hosting efforts – this is the real definition of the new Cloud Computing industry. Micro and Private Cloud FTW.

  • 9 months ago
PropEr Testing in Crossbar: Solving the problem of Unit Testing in Erlang
We’re big believers in the importance of testing our software regularly and automatically. One way to do that is through Unit Testing. Unit Testing allows a programmer to write a list of tests that ensure the functions within a module perform as expected. For example, if you wrote a summing function that took two integers and returned the sum, you might unit test that function by passing 2 and 2 and asserting the return value was 4. You can also do negative assertions, such as passing 2 and 3 and asserting the result is *not* 4.
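As a minimal sketch, the two assertions described above might look like this with EUnit, the unit testing framework that ships with Erlang/OTP. The module name sum_tests and the sum/2 function are invented for illustration and are not part of Crossbar.

```erlang
-module(sum_tests).
-include_lib("eunit/include/eunit.hrl").
-export([sum/2]).

%% The function under test: add two integers.
sum(A, B) -> A + B.

%% Positive assertion: 2 + 2 must be 4.
sum_positive_test() ->
    ?assertEqual(4, sum(2, 2)).

%% Negative assertion: 2 + 3 must not be 4.
sum_negative_test() ->
    ?assertNotEqual(4, sum(2, 3)).
```

After compiling the module, eunit:test(sum_tests) runs both tests. In a real project the tests would usually live apart from the code under test; they share a module here only to keep the sketch self-contained.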

While the above example is simplistic, the real-world benefits of testing functions are that as functions evolve, unit tests act as a level of sanity checking; if a change invalidates a formerly passing test, either the test is no longer relevant or the code change doesn’t take into account some facet of the intent of the function.

Although it is important to have good coverage in unit tests, coverage does not ensure that all corners of the function have been explored; the tests are limited to what the programmer(s) can imagine as test cases. What happens when the summing function is passed 2 and ‘a’, or 5 and 2.1234572e65? If the language doesn’t support infinitely large numbers, passing two really large integers (individually valid) could return invalid results (due to integer overflow). While an experienced programmer might know to test this kind of issue, others may not. As the complexity of the function being tested increases, the difficulty of fully testing the range of inputs and outputs increases as well. Standard unit testing becomes less capable of guaranteeing a given function operates correctly across the range of inputs it could receive.

Enter PropEr - a QuickCheck-inspired, property-based testing tool. PropEr shifts the testing perspective from specific input/output combinations to testing the properties (or assumptions) of a given function.

Brief Introduction to PropEr

PropEr (PROPerty-based testing tool for ERlang) is an open-source, property-based testing tool for Erlang, brought to us by the team that develops Dialyzer, Tidier, and TypEr. The short version is that unit-testing doesn’t guarantee that you’ve tested all the corners of your function. By defining the types of input the function expects and the resultant output, PropEr generates random test cases and evaluates the function against those randomized inputs, often finding bizarre combinations of inputs. In our summing function above, instead of testing that 2 and 2 results in 4, we would test that passing an integer() and an integer() results in an integer(). PropEr then generates randomized test cases, calls your functions, and if the result isn’t valid, tries to find a minimal example that produces the crash. The full range of functionality provided by PropEr is beyond the scope here, but the User Guide provides a good entry point.
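The summing property mentioned above could be written roughly as follows. This is a sketch that assumes PropEr is installed and on the code path; the module name sum_prop and the sum/2 function are hypothetical, while ?FORALL and the integer() generator come from PropEr itself.

```erlang
-module(sum_prop).
-include_lib("proper/include/proper.hrl").
-export([sum/2, prop_sum_returns_integer/0]).

%% The function under test: add two integers.
sum(A, B) -> A + B.

%% Property: for any two generated integers, sum/2 returns an integer.
prop_sum_returns_integer() ->
    ?FORALL({A, B}, {integer(), integer()},
            is_integer(sum(A, B))).
```

Running proper:quickcheck(sum_prop:prop_sum_returns_integer()) generates the random pairs and checks each one. Because Erlang integers are arbitrary precision, this particular property should hold; the same property written for a language with fixed-width integers would need to account for overflow.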

Using PropEr In Crossbar

Crossbar’s architecture defines a single Webmachine resource module to process a REST request. The resource module handles the generic and common bits of the request, while the specific bits for different requests reside in modules tuned for that functionality. When the resource needs more specific instruction, it passes a Payload and Context to the Bindings server with a Routing Key to be compared to the known bindings. Those that match are given a chance to process the routing request and return a possibly modified Payload.

The matching of known bindings to a given routing key is the functionality we were most interested in testing. The rules for a binding key are the same as in the RabbitMQ binding keys; a period-delimited string with two wildcards available: ‘*’ matches one and only one segment; ‘#’ matches zero or more segments. It is the ‘#’ that makes life troublesome for verifying that a matching algorithm will correctly indicate if a binding matches a routing key.
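To make the wildcard rules concrete, here is a small backtracking matcher. It is only a sketch of the rules as stated above, not Crossbar’s matching function; the module name binding_sketch and the function matches/2 are invented for illustration.

```erlang
-module(binding_sketch).
-export([matches/2]).

%% matches(Binding, RoutingKey) -> boolean()
%% Both arguments are binaries such as <<"a.*.c">> and <<"a.b.c">>.
matches(Binding, RoutingKey) ->
    match(binary:split(Binding, <<".">>, [global]),
          binary:split(RoutingKey, <<".">>, [global])).

%% Both lists exhausted together: the binding matches.
match([], []) ->
    true;
%% '#' matches zero segments, or consumes one segment and stays in play.
match([<<"#">> | Bs], Rs) ->
    match(Bs, Rs)
        orelse (Rs =/= [] andalso match([<<"#">> | Bs], tl(Rs)));
%% '*' matches exactly one segment, whatever it is.
match([<<"*">> | Bs], [_ | Rs]) ->
    match(Bs, Rs);
%% A literal segment must be identical in the binding and the routing key.
match([Seg | Bs], [Seg | Rs]) ->
    match(Bs, Rs);
match(_, _) ->
    false.
```

For example, the binding <<"a.#.c">> matches both <<"a.c">> and <<"a.x.y.c">>, while <<"a.*.c">> matches <<"a.b.c">> but not <<"a.c">>. The ‘#’ clause is where the trouble comes from: it has to try both “consume nothing” and “consume one more segment”, so the matcher backtracks and a binding with several ‘#’ wildcards can explore many paths.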

Because PropEr was only publicly released at the London Erlang Factory in June, our experience with the tool was minimal. In discussions in #erlounge on Freenode, MononcQc (Fred Hebert of Learn You Some Erlang fame) was able to provide insights and a PropEr test case to help test our matching function.

We’ll take a top-down approach to deconstructing what’s going on in the PropEr match test.

*Please note*: Links to the code and code snippets are provided in the  following paragraphs, but are only as current as when this post was last edited. As our knowledge of PropEr testing increases, changes may occur to both the tests and the matching function. Please look

prop_expands/0

{code}
prop_expands() ->
    ?FORALL(Paths, expanded_paths(),
          ?WHENFAIL(io:format("Failed on ~p~n",[Paths]),
                  lists:all(fun(X) -> X end, %% checks if all true
                        [binding_matches(Pattern, Expanded) =:= Expected ||
                            {Pattern, Expanded, Expected} <- Paths])
                 )).
{code}

Paths is the list of test cases generated by the call to expanded_paths(). The ?FORALL macro binds Paths to each generated value; for every element of the list we run the matching function and compare the result to the expected value. The ?WHENFAIL macro prints a debug line when a given Paths fails. Each element in the Paths list is a 3-tuple containing the binding (Pattern), the routing key (Expanded), and a boolean indicating whether Pattern should match Expanded. expanded_paths() tries to generate two of these 3-tuples, one that should fail and one that should succeed. This isn’t always the case, as will be explained later.

Ok, so we generate a binding and two routing keys, one that should fail and one that should succeed. If both return as expected, the test passes. PropEr can be configured with how many of these iterations to run; in this case, testing ends after 15 seconds have elapsed (by which point many thousands of tests have run) or as soon as PropEr finds a combination that doesn’t work.

expanded_paths/0

{code}
expanded_paths() ->
    ?LET(P, path(),
       begin
           B = list_to_binary(P),
           ?LET({{Expanded1, IsRight1},{Expanded2, IsRight2}},
              {wrong(P), right(P)},
              [{B, list_to_binary(Expanded1), IsRight1},
               {B, list_to_binary(Expanded2), IsRight2}])
       end).
{code}

expanded_paths/0 is our top-level generator. P is assigned values from the path() generator and is converted to a binary because the test cases are lists and the matching algorithm uses binaries. The insight was to build the binding patterns (using path()) and loop over those random patterns to build expanded forms that would fail/succeed. We pass the binding pattern to the wrong/right functions to generate those expanded forms.

So far so good.

wrong/1 and right/1

{code}
wrong(Path) ->
    ?LET(P, Path, wrong(P, true, [])).

right(Path) ->
    ?LET(P, Path, {right1(P), true}).
{code}

Hopefully self-explanatory; pass the binding to helper functions and return the 2-tuple with an expanded form and whether it should match the given binding.

wrong/3

{code}
wrong([], Bool, Acc) ->
    {lists:reverse(Acc), Bool};
wrong("*.#." ++ Rest, Bool, Acc) -> %% the # messes stuff up, can't invalidate
    wrong(Rest, Bool, Acc);
wrong("*.#", Bool, Acc) ->  %% same as above, end of string
    {lists:reverse(Acc), Bool};
wrong("*." ++ Rest, _Bool, Acc) ->
    wrong(Rest, false, Acc);
wrong(".*", _Bool, Acc) ->
    {lists:reverse(Acc), false};
wrong(".#." ++ Rest, Bool, Acc) -> %% can't make this one wrong
    wrong(Rest, Bool, [$.|Acc]);
wrong("#." ++ Rest, Bool, Acc) -> %% same, start of string
    wrong(Rest, Bool, Acc);
wrong(".#", Bool, Acc) -> %% same as above, end of string
    {lists:reverse(Acc), Bool};
wrong([Char|Rest], Bool, Acc) when Char =/= $*, Char =/= $# ->
    wrong(Rest, Bool, [Char|Acc]).
{code}

Here is where the ‘#’ wildcard rears its head and makes this function not guaranteed to return a failing expanded string. Because ‘#’ matches zero or more segments, a pattern with ‘*.#’ in it makes generating a failing pattern difficult without some form of look-ahead. Since we’re generating a lot of these, it’s not really worth implementing the look-ahead at this point.

The clauses themselves aren’t too difficult to follow (hopefully).

right1/1

{code}
right1([]) -> [];
right1("*" ++ Rest) ->
    ?LET(S, segment(), S++right1(Rest));
right1(".#" ++ Rest) ->
    ?LET(X,
       union([
            "",
            ?LAZY(?LET(S, segment(), [$.]++S)),
            ?LAZY(?LET({A,B}, {segment(), segment()}, [$.]++A++[$.]++B)),
            ?LAZY(?LET({A,B,C}, {segment(), segment(), segment()}, [$.]++A++[$.]++B++[$.]++C))
             ]),
       X ++ right1(Rest));
right1("#." ++ Rest) ->
    ?LET(X,
       union([
            "",
            ?LAZY(?LET(S, segment(), S++[$.])),
            ?LAZY(?LET({A,B}, {segment(), segment()}, A++[$.]++B++[$.])),
            ?LAZY(?LET({A,B,C}, {segment(), segment(), segment()}, A++[$.]++B++[$.]++C++[$.]))
             ]),
       X ++ right1(Rest));
right1([Char|Rest]) ->
    [Char|right1(Rest)].
{code}

We only make substitutions for the wildcards here that we know will be successful. The second clause replaces a ‘*’ with one result of segment(). The third clause inserts 0-3 segments in place of a ‘#’ appearing in the middle or end of the binding. The fourth clause handles when ‘#’ appears at the beginning of the binding and again inserts 0-3 segments. The last clause puts an exact match on the expanded result.

The generators, path/0, a/0, b/0, c/0, segment/0, and markers/0

{code}
path() ->
    ?LET(Base, ?LAZY(weighted_union([{3,a()}, {1,b()}])),
       ?LET({H,T}, {union(["*.","#.",""]), union([".*",".#",""])},
            H ++ Base ++ T)).

a() ->
    ?LET({X,Y}, {segment(), ?LAZY(union([b(), markers()]))},
       X ++ [$.] ++ Y).

b() ->
    ?LET({X,Y}, {segment(), ?LAZY(union([b(), c()]))},
       X ++ [$.] ++ Y).

c() ->
    segment().

segment() ->
    ?SUCHTHAT(
       X,
       list(union([choose($a,$z), choose($A,$Z), choose($0,$9)])),
       length(X) =/= 0
      ).

markers() ->
    ?LET(S, ?LAZY(union([[$#, $., c()], [$*, $., b()]])), lists:flatten(S)).
{code}

path(), we now see, shows off some of the powerful macros and functions PropEr exposes to build complex generators. I won’t detail what they do as the link at the top provides the documentation. However, the names of the functions should give you a pretty good idea of how the test bindings are generated.

Running the tests

To run the whole test suite, navigate to $WHISTLE/whistle_apps/apps/crossbar and execute ‘../bin/rebar eunit’ (the crossbar tests may fail if you don’t have the whapps container running, crossbar started, and auth turned off).

To run the PropEr test in the whapps VM, comment out the “-ifdef(…).” and “-endif.” at the bottom of the file and recompile the module. In the VM, run:

{code}proper:quickcheck(crossbar_bindings:prop_expands()){code}

PropEr runs 100 tests by default. There’s a host of parameters to pass to quickcheck/2, including running more than the default 100 (you can just pass the number of tests to run as the second parameter as well). If you find any bindings that don’t pass, manually verify that the expectation matches your validation (that the given binding should or should not fail to match the routing key), then file a ticket. Alternatively, you can fix it by forking the github repo, fixing, and issuing us a pull request.

Wrapping up

PropEr is a powerful tool to help you find corners in your code that aren’t properly handled. There is a learning curve, but the power of the tests makes the effort worth it. Remember, though, that passing a round of PropEr testing does not guarantee that your code is bullet-proof. PropEr brings more advanced testing to the table, but that is for your discovery and hopefully a later article here on Crosstalk!

Many thanks to the PropEr team and to Fred Hebert for helping build the PropEr tests and donating his time and effort to help our project along. We hope by documenting real-world uses of PropEr and other tools, we can show the community and developers at large practical and powerful ways to improve their codebases.

- James Aimonetti

  • 11 months ago
Leak Alert

Welcome

Welcome to Crosstalk, the 2600Hz engineering blog where we leak our technical thoughts, discussions and banter to the rest of the world—you. Our intention is to provide our readers with real world information about dealing with the technical challenges we come across in our daily engineering tasks. Given the ambition, breadth, and scope of the 2600Hz project, we think this blog will serve many different kinds of engineers and enrich the collective intelligence as a complement to our corporate blog which will continue to be updated with less technical blog posts that share more about the company.

Initial Technology Themes

We do not plan to list all of the potential topics we may be covering as we expect and intend to grow and push the boundaries as much as possible. That said, our main focuses will be related to:

  1. Carrier Interconnectivity
  2. VoIP, and in particular the SIP protocol
  3. Provisioning and deploying servers, virtualized and dedicated
  4. DNS
  5. The Erlang programming language
  6. Distributed databases
  7. Streaming Media
  8. Building and Maintaining an Open Source project and community

And that’s just to get us started!

Especially for Programmers

As we write about our coding experiences, we will be relating the lessons learned to the actual progress of our open source projects blue.box and Whistle (you can find those repositories here). We hope to show how we do things in practice and not just from a theoretical standpoint. 

We also hope that having the exposure to our software and the rationale behind it will attract programmers to contribute to all levels of our stack: the core of Whistle, the applications (or WhApps), the API and general documentation (which can be found on our Wiki), the blue.box core and blue.box modules. 

Getting Started

Look for our first engineering post soon: a technical overview of the Whistle architecture. This will be a good introduction aimed at technical people (not necessarily programmers) interested in building a scalable VoIP PBX platform using the WhApps included with Whistle. We will cover the basic setup of the servers and the creation and configuration of accounts, devices, callflows, voicemail, and resources, and finish with registering two SIP phones and placing a call between them. There will be a bonus section on signing up for DIDs and SIP trunks via our Trunkstore if you so choose (all proceeds go to supporting our project and keeping us fed); you are not required to use our trunks if you have carriers you’d prefer to use.

Thanks in Advance

We are committed to being transparent about our platform and hope you will join us for the journey. Feedback is always welcome (let’s keep it civil). You can almost always find us on IRC (Freenode in #2600hz) or join the mailing list. For commercial support, email sales@2600hz.com to talk about how we can help you build or grow a business. 

Posted by: James Aimonetti, Senior Software Engineer

  • 11 months ago


About

We build scalable, next-generation VoIP solutions that power phone companies and cloud services. Our main expertise is SIP, FreeSWITCH and SaaS. Specifically: Software Development Consulting and Professional Services, Distributed VoIP Solutions, SMS and Video Integration.